• Functional-light-js
  • Kyle Simpson is the author of You-Dont-Know-JS

This is a pure project with Hujiang’s blood flowing: earnest is the most solid pillar of HTML; Sharing is the most shining glance in CSS; Summary is the most rigorous logic in JavaScript. After hammering and honing, the Chinese version of this book is achieved. This book contains the essence of functional programming, hoping to help you learn functional programming on the road to go more smoothly. finger heart

Team of Translators (in no particular order) : Ahi, Blueken, Brucecham, CFanlife, Dail, Kyoko-DF, L3VE, Lilins, LittlePineapple, MatildaJin, Holly, Pobusama, Cherry, radish, VavD317, Vivaxy, MOE MOE, zhouyao

Chapter 9: Recursion (Ii)

Stack, heap

Let’s look at the previous two recursive functions isOdd(..). And isEven (..) :

function isOdd(v) {
	if (v === 0) return false;
	return isEven( Math.abs( v ) - 1 );
}

function isEven(v) {
	if (v === 0) return true;
	return isOdd( Math.abs( v ) - 1 );
}
Copy the code

Most browsers will get an error if you execute the following line:

isOdd( 33333 );			// RangeError: Maximum call stack size exceeded
Copy the code

What is the case with this error? The engine throws this error because it is trying to protect system memory from being exhausted by your program. To explain this, we need to first look at what happens in the JS engine when a function is called.

Each function call opens up a small area of memory called a stack frame. The stack frame contains some important information about the current state of the function statement, including the value of any variable. This happens because one function pauses to execute another function, and when the other function is finished, the engine needs to return to the state where it was suspended.

When the second function starts executing, the stack frame increases to two. If the second function calls another function, the stack frames increase to three, and so on. “Stack” means that when a function is called by a function before it, that function frame is “pushed” to the top. When the function call ends, its frame exits from the stack.

Take a look at this program:

function foo() {
	var z = "foo!";
}

function bar() {
	var y = "bar!";
	foo();
}

function baz() {
	var x = "baz!";
	bar();
}

baz();
Copy the code

Imagine the program’s stack frame step by step:

Note: If these functions are not called to each other, but are executed sequentially — for example, the next function baz() is called after the previous function has finished; bar(); foo(); The stack frame is not generated; Because before the next function starts, the previous function finishes and removes its frame from the stack.

So, every time a function runs, it takes up some memory. For most programs, that’s not a big deal, is it? But once you reference recursion, the problem is different. While you almost certainly won’t manually call thousands (or hundreds) of different functions in a call stack, you can easily see stacks that produce tens of thousands or more recursive calls.

When the engine decides that the call stack has increased too much and should be stopped, it blocks the current step with a subjective constraint, so isOdd(..) Or isEven (..) The function throws a RangeError unknown error. This is unlikely to be a limit due to near-zero memory, but rather a prediction of the engine, because memory will explode if the program continues to run. Since the engine cannot tell whether a program will eventually stop, it must make a deterministic guess.

The engine limit depends on the situation. There is nothing in the specification, therefore, it is not required. But without a limit, the device is vulnerable to sabotage or malicious code attacks, so almost all JS engines have a limit. Different device environments and different engines have different limits, and it is impossible to predict or guarantee how many times a function call stack can be called.

This limitation requires recursive performance for developers when dealing with large volumes of data. I think this limitation is probably the biggest reason why developers don’t like to use recursive programming. Unfortunately, recursive programming is a programming idea rather than a dominant programming technique.

The tail call

Recursive programming and memory limitations predate JS technology. Back in the 1960s, developers wanted to use recursive programming and wanted to run devices on their powerful computers, which had far less memory than we have on watches today.

Fortunately, in that hopeful field, a strong observation was made. This technique is called tail call.

The idea is that if a callback goes from function baz() to function bar() and the callback is executed at the very bottom of function baz() — the last call — then the stack frame for baz() is no longer needed. This means that memory can be reclaimed, or simply by executing the bar() function. As shown in the figure:

Tail calls are not recursion-specific; It applies to any function call. However, in most cases, your manual non-recursive call stack is unlikely to exceed 10 levels, so the impact of tail-calls on your program’s memory is likely to be quite low.

In the recursive case, tail-calls are obvious because it means that the recursive stack can run “forever” and the only performance issue is computation, not fixed memory limits. Tail recursion can run O(1) in fixed memory (constant time complexity calculation).

These techniques are often referred to as tail-call optimization (TCO), but the point is to distinguish from optimization techniques the ability to detect tail-call execution in fixed memory space. Tail calls are not technically what most people think, and they can run slower than normal callbacks. TCO is about optimization techniques for making tail calls run more efficiently.

Correct tail call (PTC)

Until ES6 came out, trailing calls in JavaScript were not explicitly specified (nor banned). The specific form of PTC is specified in ES6, where stack overflow does not occur as long as tail calls are used. In practice, this means that exceptions such as RangeError are not raised as long as PTC is used correctly.

First, to apply PTC in JavaScript, the code must be written in strict mode. If you haven’t used Strict mode before, you should try it. You, then, should already be using strict mode! ?

Second, a proper tail call would look like this:

returnfoo( .. ) ;Copy the code

In other words, the function call should be executed last and return whatever it returns. In this case, JS no longer needs the current stack frame.

The following cannot be called PTC:

foo();
return;

/ / or

varx = foo( .. ) ;return x;

/ / or

return 1+ foo( .. ) ;Copy the code

Note: Some JS engines can take var x = foo(); return x; Return foo(); , which can also achieve the effect of PTC. But this is not the norm.

foo(..) The 1+ part is not executed until the end of the run, so the stack frame is still there.

However, this is PTC:

returnx ? foo( .. ) : bar( .. ) ;Copy the code

After x makes the conditional, or executes foo(..) , or implement bar(..) The result is returned by the return, no matter which one is executed. This example conforms to the PTC specification.

To avoid stack growth, PTC requires that all recursion be called at the tail, so binary recursion — two (or more) recursion calls — is not PTC. We showed an example of turning dichotomous recursion into reciprocal recursion earlier in the article. Perhaps we can try to break down multiple recursions into single function callbacks that conform to the PTC specification.

Refactoring recursive

If you want to handle problems recursively, but you’re out of the JS engine’s memory stack, you need to refactor your recursive calls to make them PTC compliant (or avoid nested calls). There are some refactoring methods you might use, but they need to be weighed against the actual situation.

Readable code is our ultimate goal — remember, remember. Don’t use recursion if it makes the code difficult to read/understand; Let’s do it in an easier way.

Replace the stack

The main problem with recursion is its memory usage. The stack frame keeps track of the state of the function call and dispatches it to the next recursive call. If we figure out how to rearrange our recursion, we can implement recursion with PTC and take advantage of the JS engine’s optimized handling of tail calls, then we won’t have to keep the current stack frame in memory.

Let’s recall an example of a sum we used earlier:

function sum(num1,... nums) {
	if (nums.length == 0) return num1;
	returnnum1 + sum( ... nums ); }Copy the code

This example does not conform to the PTC specification. sum(… After nums is executed, num1 and sum(… Nums) are accumulated. That way, when the rest of the parameters… The next time NUMS makes a recursive call, the stack frame from the previous recursive call must be preserved in order to get the sum of num1.

The key point of the refactoring strategy is that we can eliminate the dependency on the stack by changing the post-processing stack to pre-processing, and then passing that partial result as an argument to the recursive call. In other words, we don’t need to keep num1 + sum(… Num1), but passes it to the next recursive stack frame, freeing the current recursive stack frame.

To begin, we make a change: we pass the partial result as a new first argument to the function sum(..). :

function sum(result,num1,... nums) {
	// ..
}
Copy the code

Result = num1; num1 = num1; num1 = num1; num1 = num1

"use strict";

function sum(result,num1,... nums) {
	result = result + num1;
	if (nums.length == 0) return result;
	returnsum( result, ... nums ); }Copy the code

Now the sum (..) PTC optimization specification already met! Yeah!

However, there is another disadvantage, after we change the function parameter passing form, the usage is not the same as before. The caller has to pass a 0 as the first argument in front of the arguments to be summed.

sum( /*initialResult=*/0.3.1.17.94.8 );		/ / 123
Copy the code

This is awkward.

The usual solution is to rename the awkward recursive function and define an interface function to hide the problem:

"use strict";

function sumRec(result,num1,... nums) {
	result = result + num1;
	if (nums.length == 0) return result;
	returnsumRec( result, ... nums ); }function sum(. nums) {
	return sumRec( /*initialResult=*/0. nums ); } sum(3.1.17.94.8 );								/ / 123
Copy the code

Things are better. But there’s still a problem: we’re using two functions to do what we used to do with just one function. Sometimes you’ll find developers “hiding” the recursion with internal functions when dealing with this type of problem:

"use strict";

function sum(. nums) {
	return sumRec( /*initialResult=*/0. nums );function sumRec(result,num1,... nums) {
		result = result + num1;
		if (nums.length == 0) return result;
		returnsumRec( result, ... nums ); } } sum(3.1.17.94.8 );								/ / 123
Copy the code

The drawback of this method is that the external function sum(..) is called each time. , we all have to recreate the internal function sumRec(..) . We can place them horizontally in a function that executes immediately, exposing only the function we want:

"use strict";

var sum = (function IIFE(){

	return function sum(. nums) {
		return sumRec( /*initialResult=*/0. nums ); }function sumRec(result,num1,... nums) {
		result = result + num1;
		if (nums.length == 0) return result;
		returnsumRec( result, ... nums ); }}) (); sum(3.1.17.94.8 );								/ / 123
Copy the code

Ok, now that the PTC specification is met, sum(..) is guaranteed. Arguments are neat and callers do not need to know the internal implementation details of a function. Perfect!

But… Oh, my God, it was a simple recursive function, and now there’s a lot of noise. Readability has decreased markedly. It was unsuccessful, to say the least. Sometimes, it’s just the best we can do.

Fortunately, in some other cases, like the last one, there’s a better way. Watch it again:

"use strict";

function sum(result,num1,... nums) {
	result = result + num1;
	if (nums.length == 0) return result;
	returnsum( result, ... nums ); } sum(/*initialResult=*/0.3.1.17.94.8 );		/ / 123
Copy the code

You might notice that result is just like num1, that is, we can use the first number in the list as our running sum; This includes even the first call. What we need is to rename these parameters to make the function legible:

"use strict";

function sum(num1,num2,... nums) {
	num1 = num1 + num2;
	if (nums.length == 0) return num1;
	returnsum( num1, ... nums ); } sum(3.1.17.94.8 );								/ / 123
Copy the code

Handsome. Much better than before, eh? ! I think this model strikes a good balance between claim/reasonableness and execution.

Let’s try refactoring the previous maxEven(..) (It does not meet PTC specifications at present). Just as we took the sum of the arguments as the first argument, we can successively reduce the number in the list, all the while taking the largest even number encountered as the first argument.

For clarity, we might use an algorithmic strategy (similar to the one we discussed earlier) :

  1. First, take the first two parametersnum1num2Compare.
  2. ifnum1It’s even andnum1Is greater thannum2.num1Stay the same.
  3. ifnum2It’s an even numbernum2Assigned tonum1.
  4. Otherwise,num1Is equal to theundefined.
  5. If there are other parameters besides these two parametersnumsAnd compare them withnum1Do a recursive comparison.
  6. And finally, whatever value it is, it just returns, rightnum1.

Following the steps above, the code is as follows:

"use strict";

function maxEven(num1,num2,... nums) {
	num1 =
		(num1 % 2= =0 && !(maxEven( num2 ) > num1)) ?
			num1 :
			(num2 % 2= =0 ? num2 : undefined);

	return nums.length == 0? num1 : maxEven( num1, ... nums ) }Copy the code

Note: the function is first called maxEven(..) Not for PTC optimization, when it only passes num2, only one recursive level is returned; It’s just a trick to avoid repeating % logic. Therefore, as long as the call is a completely different function, there is no increase in the recursive stack. The second call maxEven(..) Is a truly recursive call from a PTC optimization perspective, and therefore does not increase the stack as the recursion proceeds.

Again, this example is intended only to illustrate a way to convert recursion into compliance with the PTC specification to optimize stack (memory) usage. A more straightforward way to find the maximum even value might be to filter the NUMs in the parameter list and then bubble or sort.

Refactoring recursion based on PTC does have some impact on simple declarative form, but there are still reasons to do it. Unfortunately, there is some recursion that doesn’t work well even if we use interface functions to extend it, so we need to think differently.

Subsequent Delivery Format (CPS)

In JavaScript, the word continuation is often used to refer to a callback function that specifies the next step to execute after a function has completed. Organizing code so that each function receives another execution function at its end is called the successor transfer format (CPS).

Some forms of recursion, especially mutual recursion, are virtually impossible to reconstruct according to pure PTC specifications. We mentioned earlier that fib(..) Function, and the reciprocal recursive form that we derive. In both cases, there are multiple recursive calls that impede PTC memory optimization.

However, you can make the first recursive call and include subsequent recursive calls in subsequent functions and pass them to the first call. Although this means that more functions will eventually need to be executed on the stack, the stack memory usage will not grow indefinitely because all subsequent functions are in PTC form.

The fib (..) Make the following modifications:

"use strict";

function fib(n,cont = identity) {
	if (n <= 1) return cont( n );
	return fib(
		n - 2,
		n2 => fib(
			n - 1,
			n1 => cont( n2 + n1 )
		)
	);
}
Copy the code

Take a close look at what you’ve done. First, we use cont(..) from Chapter 3 by default. The successor function represents identity(..) ; Remember, it simply returns whatever was passed to it.

More importantly, not one but two subsequent functions have been added. The first subsequent function receives the result of the fib(n-2) run as argument n2. The second internal subsequent function takes the result of the fib(n-1) run as parameter n1. When the values of n1 and n2 are obtained, the two are added together (n2 + n1), and the result of the operation is passed to the next subsequent function cont(..). .

Perhaps this will help us to sort out the flow: as we discussed earlier, after the recursive stack, when we pass partial results instead of returning them, each step is contained in a subsequent function, which slows down the computation. This technique allows us to perform multiple steps that conform to the PTC specification.

In static languages, CPS generally provides an opportunity for tail calls that the compiler can automatically recognize and rearrange recursive code to take advantage of. Unfortunately, this does not work with native JS.

In JavaScript, you have to write your own CPS code. That’s not smart; The declaration in the form of a command symbol is bound to make things a little unclear. But in general, this form is still more declarative than a for loop.

Warning: One of the more important things to note is that in CPS, creating additional internal subsequent functions still consumes memory, but with a difference. Rather than accumulating previous stack frames, closures simply consume excess memory (in general, excess memory in the stack). In these cases, the engine doesn’t seem to have RangeError limits enabled, but that doesn’t mean your memory usage is proportionally fixed.

Spring bed

In addition to CPS subsequent delivery formats, another technique for memory optimization is called a spring bed. In the spring-bed code, subsequent functions similar to CPS are created, except that they are not passed, but simply returned.

No longer does a function call another function, and the stack is no deeper than one layer, because each function returns only the next function to be called. The loop simply continues to run each returned function until there are no more functions to run.

One of the advantages of spring beds is that you can use this technique in a non-PTC environment. Another advantage is that each function is called normally instead of PTC optimization, so it can run faster.

Let’s try trampoline. :

function trampoline(fn) {
	return function trampolined(. args) {
		varresult = fn( ... args );while (typeof result == "function") {
			result = result();
		}

		return result;
	};
}
Copy the code

When a function is returned, the loop continues, executing the function and returning its result, and then checking the type of result returned. Once the result type returned is not a function, the spring bed considers the function call complete and returns the result value.

So we may need to use the previous trick of passing partial results as parameters. Here is an example of how we used this technique in our previous array summation:

var sum = trampoline(
	function sum(num1,num2,... nums) {
		num1 = num1 + num2;
		if (nums.length == 0) return num1;
		return (a)= > sum( num1, ...nums );
	}
);

var xs = [];
for (let i=0; i<20000; i++) { xs.push( i ); } sum( ... xs );/ / 199990000
Copy the code

The disadvantage is that you need to wrap recursive functions in functions that perform the spring bed function; Also, just like CPS, you need to create closures for each subsequent function. Unlike CPS, however, each subsequent count returned is run and completed immediately, so there are no more and more closures accumulating in the engine as the depth of the call stack is exhausted.

Spring bed technology, except for the execution and memory performance is better than that of the CPS has the advantage of them in the statement in the form of the recursive less invasive, because you don’t have to receive the follow-up function parameters and change the function, so in addition to the execution and memory performance, spring bed where technology is better than that of the CPS and their recursive form less invasive to the statement. While spring-bed techniques are not ideal, they can effectively strike a balance between command loop code and declarative recursion.

conclusion

Recursion is when a function calls itself recursively. Well, that’s the definition of recursion. Get it! ?

Direct recursion refers to calling itself at least once until basic conditions are met. Multiple recursion (like binary recursion) refers to making multiple calls to itself. Reciprocal recursion is when two or more functions loop recursively to each other. Recursion, on the other hand, has the advantage of being more declarative and therefore generally easier to read.

The advantage of recursion is that it is more declarative and therefore generally easier to read. The downside is usually performance, but the more limited aspect is memory than execution speed.

Tail calls save memory space by reducing or freeing stack frames. To implement tail-call “optimization” in JavaScript, you need to be based on strict patterns and appropriate tail-call (PTC). We can also mix several techniques to refactor non-PTC recursive functions into PTC format, or at least save memory space by tiling the stack.

Remember: Recursion should make code easier to read. If you misuse or abuse recursion, the code will be worse to read than the command form. Don’t do that.

* * 【 】 in the previous chapter translation serial | chapter 9: recursion (on) – “JavaScript lightweight functional programming” | “you don’t know the JS” companion piece * *

IKcamp’s original new book “Mobile Web Front-end Efficient Development Combat” has been sold on Amazon, JD.com and Dangdang.

Hujiang Web Front-end Shanghai team is looking for a Web front-end architect. Your resume can be found at zhouyao@hujiang.com

IKcamp website: www.ikcamp.com


In 2019, iKcamp’s original new book Koa and Node.js Development Actual Combat has been sold on JD.com, Tmall, Amazon and Dangdang!