In Java, the creation of functions (functions declared in a class, which we prefer to call methods, behaviors) depends on a class, an abstract class, or an interface. Scala, on the other hand, promotes functional programming, where functions are treated as “first-class citizens” and can be used like variables, either as arguments to a function or as functions assigned to another variable.

Of course, before we get started, instead of boring grammar, take five minutes to think about what FP has to offer in these days of OOP.

In fact, the debate between OOP and FP has been going on for so long that I haven’t been qualified to advise either side of it, so I’m going to take the middle ground here: pure OOP and pure FP are both extremes, and “when things go too far, they go too far.” Also, there is no point in talking about programming paradigms outside of the context of actual application development.

The summary of FP and OOP comes from the following excellent articles, and there are many great people on Zhihu.

  • Segmentfault: What is the difference between procedural programming and functional programming?
  • Segmentfault: Immature idea of the difference between FP and OOP
  • Zhihu: How to understand the idea of object-oriented programming in nature?
  • Zhihu: What are the disadvantages of object-oriented programming?
  • Cnblogs: Why do we need to know “Functional Programming”?
  • Zhihu: What is functional programming thinking?
  • CSDN: Why didn’t functional programming catch on?

Before we get started: Why OOP programming

OOP is a design idea, not a Java/C++ language privilege. Let’s take a look at Alan Key, the father of OO languages:

I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning — it took a while to see how to do messaging in a programming language efficiently enough to be useful). … OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP.

“I think OF OOP as a network structure (like a cell forming an organization, or a computer forming an Internet) where each node communicates by sending messages. . OOP to me is just that each object holds internal hidden information, but allows limited communication by exchanging information.

OOP ideas are ideal for writing large and complex systems. A classic example is the Linux kernel, which fully embodies multiple relatively independent components (process scheduler, memory manager, file system…). The idea of working together. ** Although the Linux kernel is written in C, it is more OOP ** than many programs written in so-called OOP languages.

The advantage of OOP comes only when it is impossible to have a “God view” of the whole and the system has to be broken up into many objects. For example, the company should be divided into many departments, each of which is relatively independent and has its own articles of association, handling methods and rules. Departments “hide their internal state”, that is, they are independent from each other. For example, if you want a department to do something according to the rules, you don’t care about the internal details of “who will do it” or “how they will do it”, you need to know that they will do it for you.

The system maintained through this loose collaboration can be infinitely extended to form a large, complex system. That’s what OOP wants to say, nothing more. I’m not a big proponent of combining OOP with “inheritance, encapsulation, polymorphism” : If OOP is just OOP for OOP’s sake, the maintenance costs of a complex system will be higher.

As an aside, Scala’s designers seem less than happy with Java’s inheritance mechanism. Take three generations of inheritance: grandfather, father, son. The father inherits attributes/methods from the grandfather, but the son may not be as interested in these attributes. However, according to strict inheritance laws, the son has to accept this heavy “father love”, redundant and useless attributes/methods along the way.

We’ll revisit this issue when we introduce Scala’s Trait traits.

What does FP bring?

Functional programming is more concerned with the mapping of data: I enter an x and hope to get a corresponding y through f(x) transformation. And in an ideal world, as long as the x I put in stays the same, the y I get is never going to change. We often combine multiple functions into a closed, long pipeline, describing a process like this:

/ / / / without any global variables defined function f (cabbage, potatoes, carrots) = > (rice) g (eat) = > if (eat! = null) (full) else (hungry) h (hungry) = > if (hungry = = full) (work) else (fishing) -- -- -- -- -- -- -- the Main -- -- -- -- -- -- -- -- -- -- -- - h (g (f (cabbage, potatoes, carrots)))Copy the code

Wait, isn’t that procedural programming?

Functional programming emphasizes isolation of side effects, that is, data that does not depend on or change beyond the current function. All the other “functional” stuff comes from this: pure functions, currization of functions, higher-order functions. (We’ll show how Scala uses them in the advanced functions section that follows.)

Procedural programming uses external variables to represent state instead of passing in parameters. Method affects nearby code by changing external variables, not by returning values. To figure out what the method does, the reader must read each line carefully. If you find an external variable, you must look for its origin, find out what methods modified it, and know the consequences: what ripple effects the change in the variable can have.

In functional programming, all the magic is inside this function. You only need to know one thing: what you type and what you get for it.

(The following is the process-oriented solution as I understand it)

// Global variables dinner and hungry eat = null; Hungry state = true; // Define the function f(cabbage, potato, carrot) => eat = dinner; G () => if =null) Hunger state = false; H () => if(hungry! =true) (work) else (touch fish); -------Main------------ f(Cabbage, potato, carrot); g(); h();Copy the code

In functional languages, each step produces a state that is passed as a parameter to the next step, rather than using global variables.

Finally, when FP works: if it’s “step one”, “Step two”… FP is more suitable than OOP. Its specific applications include Spark RDD, Java 8 Stream processing, and so on.

Define a function in Scala

In order to learn Scala, we need to take the first step: how to define a Scala function. Scala functions can be defined like this:

// Define the full writing of the function.
def function(p: ParamType) : ReturnType= {... }// When defining a function, omit the return value type.
This does not mean that a value is not returned, but depends on the last statement of this function that can return a value.
def function(p : ParamType) = {... }// Note that this is the empty parenthesis function.
def function() = {... }// Note that this is a parameterless function.
def funtion = {... }// This function allows multiple argument lists. (This declaration is related to the Currization of functions.)
def function() () () = {... }Copy the code

In contrast to the serious method declarations of Java, Scala’s function definitions are quite arbitrary. (Not taking into account generic parameters, default parameter assignments, implicit parameters, etc……) But for now, let’s just keep the first two in mind.

Scala allows functions to be defined without a specific return value type, so that Scala uses type inference to determine the return value of a function.

  1. If Scala knows that this function has a return value, but cannot infer the detailed type, the default return value isAnyType.
  2. If Scala does not consider this function to have a return value, the default return value isUnit
  3. Scala’s argument list is written like this: the parameter name comes first, then:Delimit, followed by the parameter type.
  4. Scala does not need to be used explicitlyreturnThe keyword represents the return value.

In addition, Scala can define functions inside functions, like this:

def main(args: Array[String) :Unit = {
  def inner() :Unit ={
    print("...")}}Copy the code

I define an inner function inside main and use it only in that scope.

Scala functions can have default arguments

We can define the function argument list directly after the = sign to assign a default value.

def greet(word : String = "Scala") :Unit ={
  println(s"Hello,$word!")}//---Main---//
greet()
Copy the code

So even if we call the greet function, the greet() function prints Hello,Scala!

But if we do this, the following code will report an error.

def greet(word1 : String = "Scala", word2 : String) :Unit ={
  println(s"$word1.$word2!")}//----Main-----//
greet("hello")
Copy the code

The reason for this is that the compiler is not sure whether “hello” is intended to override the default assignment of word1 or to assign to word2. We might want word1 to take the default argument and assign the “hello” string to word2.

To avoid such misunderstandings, we need to explicitly specify the variable assigned by “hello” when assigning.

//word1 has a default value, we explicitly declare hello to be assigned to word2, and there should be no problem.
// The screen will print: Scala, Hello!
greet(word2 = "hello")
Copy the code

Remember that in Scala, every argument to a function is of type val.

Mutable arguments in Scala functions

A mutable parameter in Java is actually an array.

The way to verify this is to copy the code below under the same class and try to overload it, but the compilation fails. Indicates that the compiler considers the undecided participation array to be of the same type.

Note: Overloading only evaluates method names and argument lists. Has nothing to do with return values and modifiers.

/ / Java used... To represent a variable parameter.
public void numbers(int. Ints){
    for (int i:Ints) System.out.println(i);
}

// The overload failed, indicating that the Java compiler thinks the function's signature is the same.
public void numbers(int[] Ints){
    for (int i:Ints) System.out.println(i);
}
Copy the code

Alternatively, an array can be used to assign to an indefinite parameter, but conversely, multiple parameters cannot be assigned to an array parameter.

// Variable parameter type
public static void numbers(int. Ints){
    for (int i:Ints) System.out.println(i);
}

// This is allowed
int[] arr =  {1.2.3.5.6};
numbers(arr);
//------------------------------------------------------------------
// Array parameters
public static void numbers(int[] Ints){
    for (int i:Ints) System.out.println(i);
}

// Incorrect usage
numbers(1.2.3.4.5);
Copy the code

In addition, in order to avoid the problem of assignment confusion caused by variable parameters, when the function has other parameters, the variable parameter is assigned last.

// Correct statement
public static void numbers(int a,int. Ints){}

// False statement
public static void numbers(int. Ints,int a){}
Copy the code

Indeterminate parameter assignment in Scala is the same as in Java, but written differently. The type is followed by an * sign to indicate that it is an indeterminate parameter.

//Scala Type. Note that mutable arguments come after.
def numbers(int: Int, ints: Int*) :Unit = {
  println(S "input${ints.length+1}A parameter")}Copy the code

Confusing situations

// Is STR a variable?
def str = "This is a String."
Copy the code

Is STR in the code above a string? And it isn’t. It’s just that because it’s a simple function that just returns a string, all the nuances are cut out and it looks like declaring a variable.

Is STR a string or a wrapped method? As callers of the code, we don’t really care that much. All we know is that we can get a string by calling STR. So let’s make it a little bit more complicated:

def str = {
  
  var temp =   "this is a string."
  
  temp + "(but not actually)"
  
  temp + "well, but they don't know the detail~"
  
  temp
}

//---- As far as the caller of the code is concerned, it is not aware that STR is complicated to get. -----//
println(str)
Copy the code

To the user, the STR variable looks like a string. This is an illusion created by parameterless functions. But its original intention was not to confuse the masses, but to do with the Uniform Access Principle. I’ll cover this principle in more detail in the advanced functions section, as well as the use of empty parenthesis and no-argument functions.

The inertia function

Don’t let a loaded rifle appear if you’re not going to fire. This theory is known as Chekhov’s Gun.

Lazy computing is a feature of many functional programming languages (such as Spark’s lazy invocation example). It allows time-consuming calculations to be deferred until absolutely necessary. It is called “lazy”, but in practice it improves system efficiency by avoiding loading many large programs/files that will not be used.

An example of lazy invocation in life: turn on the tap only when water is needed, and the water will flow down; When not in use, the faucet is turned off, naturally no water flow.

Consider another Lazy Java singleton: Suppose there is a student (I) who does his homework only when it is due:

/** * He is a lazy and cunning student. */
public class Student {
    private String Homework = null;
    public String handOn(a){
        if(Homework ==null)
        {
            Homework = "I barely finished it on time ;) ";
            return Homework;
        }
        returnHomework; }}Copy the code

Explore Scala’s lazy loading

First, define a plain variable string, which is assigned using the getString method. The main function then prints a line: print a line. We mark the getString method with a print command so that we can observe the actual execution order of the program.

def main(args: Array[String) :Unit = {
  val string :String = getString
  println("print a line.")}def getString:String={
  println("This variable has been assigned.")
  "I'm a lazy boy"
}
Copy the code

The console prints two lines of statements:

This variable has been assigned.
print a line.
Copy the code

We don’t actually use the string variable, but the main function still calls the getString method and initializes it. The hope is that if the string variable is not used, the main function will not load it at all. At this point, the variable is preceded by a lazy keyword, indicating that its initialization is delayed until the moment it is called.

lazy val string :String = getString
Copy the code

Executing the executor again, you can see that since String is not used, getString is not executed either.

print a line.
Copy the code

Will string be preloaded if we execute the program in the following order?

lazy val string :String = getString
println("print a line.")
println(s"$string")
Copy the code

The answer is no. Note: The lazy string is initialized only at the moment it is called. So getString is called only after “print a line” is printed.

Lazy also works for variable declarations

Lazy can also be used on variables of type val (in other words, lazy cannot be used on variables of type var). It does the same thing: it is initialized only when necessary.

Recursive function

The flow of nested calls to functions

For example, a procedure with nested calls (pseudocode below) :

fun1 ={
	fun2
}

fun2 = {
	fun3
}

fun3 = {
	return value
}
------main--------
call fun1
Copy the code

When fun1 is executed, a new data space is created in memory for fun1 to use. But since fun1 in turn needs the return of fun2, a break point is set up here, and a new data space for fun2 is opened and entered (since fun2 cannot continue until it is finished). Fun2, in turn, needs the return of fun3, so a break point is set up, and the PC points to a new data space for fun3 (since fun3 cannot continue until fun2 is finished).

whenfun3When the execution is complete,fun2Get the return value, return to the original break point, continue to execute, calculate the result returned tofun1. whenfun1Wait tofun2It also continues to run…. from the break point

The entire operation process is executed in a stack space.

The so-called recursive function execution process is to set up an interrupt condition, let fun1 function continuously create a new fun1 function to calculate, until fun1 meets the interrupt condition, and then return the calculation results along the reverse process. This break condition determines how many times the recursive function can be executed. So if this condition is not set properly, the recursive function will continue in an infinite loop until the ever-expanding data space runs out of memory.

When a function completes, or returns a value, the function completes, and the program pointer falls back to where the last function that called the function left off.

If the argument to the recursive function is an object

If an object is passed to a recursive function and the recursion changes the members of the object, the situation changes a little bit. To illustrate the problem, Java code is used here.

I’ve declared a simple main function here, a recursive function, a simple class. If this main function is executed, the screen prints 0,1,2,3…. 10? Or 10, 9, 8, 7…. 0?

public static void main(String[] args) {
    Counter counter = new Counter();
    cycRef(counter);
}
static void cycRef(Counter counter) {

    if (counter.count < 10) {
        counter.count += 1;
        cycRef(counter);
    }
    System.out.println("count =" + counter.count);
}
class Counter {
    int count = 0;
}
Copy the code

But here’s the result: it prints 10 11 times.

Why is that? Most importantly, Java puts this object in the heap space, and the recursive function passes a copy of the heap address of the object referenced by this parameter (again, value passing). So in fact all 11 of these CycRef operations (which this CycRef performed 11 times, counting the main function calls) point to the same space in the Java heap. When layer 11 nested CycRef changes Counter. Count to 10, go back to 10, 9, 8…. Up to the outermost CycRef, Counter. Count is also 10, unchanged.

But suppose we recurse to a primitive datatype and run it:

public static void main(String[] args) {
    cycVal(0);
}
static void cycVal(int counter) {
    if (counter < 10) {
        cycVal(counter + 1);
    }
    System.out.println("count =" + counter);
}
Copy the code

We output 10,9,8,7….. 0. The reason is that since the parameters are basic data types, each call to CycVal passes a copy of the literal, which is passed to the next CycVal method to receive and modify. Therefore, each layer of CycVal will only modify its own counter value and output it, not affected by other CycVal methods.

Note: Scala’s AnyVal type (e.g. Int,Double) is also considered a primitive data type (although it falls under the category of a class, it is essentially a wrapper around Java primitive data types), so the above code rewrite runs in Scala and the result is the same.

Incidentally: Java parameter passing is all value passing. For details, please refer to this zhihu link:

The timing of using recursive functions

Just remember two basic elements of recursive functions:

  1. Critical conditions. For example: let the recursive function f(n). When n=a, there is an explicit return value, or the program terminates.
  2. Recursive equation. For example, f(n-1) = g(f(n)). Both sides of this equation have f(dot).

There is one more important condition: each recursion should be directed towards the recursion condition. We all know that the for loop is endless:

for(int i =1; i< 100; I -). Because I is going to get smaller and smaller, it’s just going to get further and further away from the end of the cycle.

If we can quickly mathematically summarize the critical conditions and recursive equations for a problem, then we can use recursion to solve it elegantly. However, it is difficult to generalize its critical conditions and recursive equations for most of the application problems we encounter at ordinary times. Therefore, code implemented using recursion is more obscure to the average person.

Simple exercise in Scala recursive functions

A few examples, as an exercise, are implemented using Scala code.

1. To calculate n! N factorial. Let f(n)= n * (n-1) * (n-2)… * (1).

Simple analysis shows that f(0) = F (1) = 1(factorial specifies that the factorial of 0 is 1, which is the critical condition); And there is a law of factorial, namely: f(n) = n* f(n-1) (recursive equation), the recursion direction is along the direction of n decrease. (Approximate critical condition)

//Scala allows special symbols for naming. Refer to Scala data format & Naming
/ / set!!!!! (n) = n * (n-1) * ... * 1
def !!!!!(n: Int) :Int = {
  //n=1,n=0 can be reduced to n<3
  if (n < 3) {
    1
  } else{
    / /!!!!! (n) = n * !! (n-1)n * !! (n- 1)}}Copy the code

2. Fibonacci sequence problem: There are many explanations and explanations about this sequence in the encyclopedia, which are omitted here.

We can get the following useful information: F (1) = 1, F (2) = 2 (critical condition), F (n) = F (n – 1) + F (n – 2) (recursive formula); The recursion direction is n from large to small recursion (approximate critical condition).

//F(n) = F(n-1) + F(n-2)
//F(1) = 1 ; F(2) = 2
def Fibonacci(int: Int) :Int = {
  // Returns 1 when int=1 and 2 when int=2, simplifying things here.
  if (int < 3) {
    int
  } else {
    Fibonacci(int - 1) + Fibonacci(int - 2)}}Copy the code

3. Monkeys eat peaches Problem: the first day the monkey picked a number of peaches, immediately eat half, not satisfied, and eat one more. The next morning he ate half of the remaining peaches and one more. Every morning I eat half and a half of what WAS left over from the day before. In the morning of the 10th day, when I wanted to eat again, THERE was only one peach left. How many peaches do you pick on the first day?

This is a word problem that abstracts the problem conditions into elements of a recursive function. The number of peaches is related to the number of days, so if the number of days is set to variable N and this relationship is set to f, the number of peaches per day can be expressed as F (x).

From the first sentence highlighted in grey we can infer:


F ( n + 1 ) = F ( n ) 2 + 1 F(n+1) = \frac{F(n)}{2} + 1

Or the other way around (we use it as a recursive equation) :


F ( n ) = 2 ( F ( n + 1 ) 1 ) F(n) = 2(F(n+1)-1)

We know from the problem: F(10) = 1, which means there is only one peach on day 10. Critical condition. The recursion direction is n from 1 to 10 recursively.

//f(n + 1) = f(n) / 2 - 1
//f(n) = 2(f(n+1) + 1)
def peach(int: Int) :Int = {
  if (int == 10) {
    1
  } else {
    2 * (peach(int + 1) + 1)}}//-----Main--------//
println(s"${peach(1)}")
Copy the code

The answer: 1,534.

The cardinal sin of recursive functions: double counting

Let’s look at the Fibonacci sequence again:

//F(n) = F(n-1) + F(n-2)
//F(1) = 1 ; F(2) = 2
def Fibonacci(int: Int) :Int = {
  // Returns 1 when int=1 and 2 when int=2, simplifying things here.
  if (int < 3) {
    int
  } else {
    Fibonacci(int - 1) + Fibonacci(int - 2)}}Copy the code

This solution is not efficient: the reason is that the following line of code contains a lot of repetitions:

Fibonacci(int - 1) + Fibonacci(int - 2)

The program will enter the first Fibonacci to recurse, and then enter the second Fibonacci to recurse. In fact, most of the computation in these two recursions is repeated. This results in an exponential increase in the complexity of the recursive function, and a lot of time is wasted on repeated calculations: when int=40, the program is already stuck in an infinite loop.

But recursively taking the factorial does not repeat itself, so even if n passes in a fairly large number (requiring the Int to be changed to BigInt), the program can recursively find the solution fairly quickly.

Interested students can implement the code themselves and compare the computation time required by Fibonacci(50) and factorial (50). At the same time, there are also a lot of big cattle for the Fibonacci problem optimization scheme, not in detail here.

Find the maximum value in the ListBuffer recursively

This example is a case that the author supplemented in the follow-up set learning. It will take me a while to update the Scala collection.

The recursive equation and the critical condition are not so easy to describe. So first of all, we need to sort out:

  • First of all, put the sequence inheadSet this parameter to the maximum value. And thentailSubset (tailRefers to removing the first elementheadAfter the rest of the elements are reconstituted in the set) the maximum value is compared. Returns if the first element is indeed the largesthead. In addition, we also use a recursive approach to searchtailMaximum in a subset.
  • ifheadIf it’s not the actual maximum, then thisheadGet rid of it, get rid of the resttailFind the maximum in the subset and repeat the first step.

In addition, we add the special case where the length of the list is 0 and 1, and give the following code:

def max(listBuffer: ListBuffer[Int) :Int= {if(listBuffer.isEmpty) ()

    if(listBuffer.size == 1){
      listBuffer.head
    }

    else if(listBuffer.head > max(listBuffer.tail))
    {
      listBuffer.head
    }else max(listBuffer.tail)
}
Copy the code