This article was first published on my personal public account, TechFlow. Original writing isn't easy, so please consider following.


In today's article we are going to talk about the famous Taylor's formula. It really is famous: anyone who has taken college calculus will remember its name, and even if you skipped every class, you are bound to run into it in the pre-exam review highlights.

My first impression of it was that it was difficult and not very interesting, just a way of approximating a function. I have been revisiting it recently and have come up with some new ideas that I hope to explain as clearly as I can.


What Taylor's formula does


Before we look at the specific formula and its proof, let's look at what it does; with that in mind it is much easier to understand where it comes from and how it works. This is also a rule of thumb I have picked up from years of studying on my own.

Taylor's formula essentially solves an approximation problem. If we have a function that looks complicated, evaluating it directly may be very difficult, so we want a way to approximate it that is good enough for our purposes.

From this we get two important points: one is the approximation method, the other is the approximation precision. We need to find a suitable way to approximate, and at the same time we need the accuracy of the approximation to be controllable; otherwise the whole exercise is meaningless. This is easy to understand if you relate it to the real world. For example, suppose we use a machine tool to make a part. We all know there is no perfect circle in the world, and in fact we don't need perfection, but we do need to make sure the deviation is manageable and stays within a certain range. Taylor's formula is the same: it helps us make the approximation, and it also guarantees that the result is accurate enough.


Taylor’s formula


Now let's look at the definition of Taylor's formula. We already know it is used to approximate a function, but how exactly do we approximate? A very naive idea is to use the slope, that is, the first derivative.

Here's an example: the classic tangent-line diagram of a derivative.

As that picture shows, near a point $x_0$ the curve is approximated by its tangent line, $f(x) \approx f(x_0) + f'(x_0)(x - x_0)$. As $\Delta x = x - x_0$ shrinks, $x$ gets closer and closer to $x_0$, and $f(x_0) + f'(x_0)\Delta x$ gets closer and closer to $f(x)$.
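
To make this concrete with a quick example of my own: take $f(x) = \sqrt{x}$ near $x_0 = 4$, where $f(x_0) = 2$ and $f'(x_0) = \frac{1}{2\sqrt{4}} = 0.25$. Then

$$\sqrt{4.1} \approx 2 + 0.25 \times 0.1 = 2.025,$$

which is already very close to the true value $\sqrt{4.1} \approx 2.0248$.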

Of course, when $\Delta x$ is large, the error is obviously going to be large. To reduce the error, we can bring in the second derivative, the third derivative, and even higher derivatives. Since we don't know exactly how many times the function can be differentiated, let's simply assume that $f(x)$ has derivatives up to order $(n+1)$ on the interval, and try to write a polynomial that approximates the original function:

$$P_n(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_n(x - x_0)^n$$

We want the difference between this polynomial and the original value to be as small as possible. How small is small enough? Mathematicians have pinned it down: they want it to be a higher-order infinitesimal of $(x - x_0)^n$. That is to say, the limit of the ratio of the error to $(x - x_0)^n$ is 0:

$$\lim_{x \to x_0} \frac{f(x) - P_n(x)}{(x - x_0)^n} = 0$$
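
For a quick sense of what "higher-order infinitesimal" means, here is the standard toy example: as $x \to 0$, $x^2$ is a higher-order infinitesimal of $x$, because

$$\lim_{x \to 0} \frac{x^2}{x} = \lim_{x \to 0} x = 0.$$

The error is required to vanish faster than $(x - x_0)^n$ in exactly this sense.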

As we said, we're approximating with derivatives, so let's assume that the polynomial and the original function have the same derivatives at $x_0$, up to order $n$:

$$P_n^{(k)}(x_0) = f^{(k)}(x_0), \quad k = 0, 1, 2, \ldots, n$$

It's actually very easy to get the coefficients from this assumption: the polynomial is built so that when we differentiate it $k$ times and plug in $x_0$, every term of lower degree has already vanished and every term of higher degree still carries a factor of $(x - x_0)$, so only the $k$-th term survives. That gives:

$$P_n^{(k)}(x_0) = k!\,a_k = f^{(k)}(x_0) \;\Longrightarrow\; a_k = \frac{f^{(k)}(x_0)}{k!}$$

If we substitute these coefficients back into the polynomial, we get:

$$f(x) \approx P_n(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$$
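
To make this formula feel less abstract, here is a small Python sketch of how such a polynomial can be evaluated. This is just my own illustration; the function name `taylor_poly` and its arguments are made up for the example, not taken from any library.

```python
from math import exp, factorial

def taylor_poly(derivs_at_x0, x0, x):
    # Evaluate P_n(x) = sum_k f^(k)(x0) / k! * (x - x0)^k,
    # given the list [f(x0), f'(x0), ..., f^(n)(x0)].
    return sum(d / factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))

# For f(x) = e^x every derivative is e^x, so at x0 = 0 they all equal 1.
n = 5
approx = taylor_poly([1.0] * (n + 1), x0=0.0, x=0.5)
print(approx, exp(0.5))  # about 1.648698 vs 1.648721
```

Even with only five terms after the constant, the two values already agree to four decimal places.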



Proving Taylor's formula


So this is the whole idea behind Taylor's formula: we approximate the function using its higher derivatives. What remains is to prove that it does what we want, namely that the error is small enough.

Let's use a function $R_n(x)$ to represent the difference between $f(x)$ and the polynomial $P_n(x)$, that is, $R_n(x) = f(x) - P_n(x)$. Comparing them directly is hard, so mathematicians resort to a series of fancy, spectacular maneuvers.

If we plug in $x = x_0$, we get $R_n(x_0) = 0$. Not only that: $R_n'(x_0) = R_n''(x_0) = \cdots = R_n^{(n)}(x_0) = 0$.

We don't really need to prove this at all; we can just take derivatives and plug in. Because the coefficients were constructed so that $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for every $k \le n$, the conclusion above follows as soon as we set $x = x_0$.
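
It's worth checking the smallest case by hand. For $n = 1$ we have $R_1(x) = f(x) - f(x_0) - f'(x_0)(x - x_0)$, so

$$R_1(x_0) = 0, \qquad R_1'(x) = f'(x) - f'(x_0) \;\Rightarrow\; R_1'(x_0) = 0,$$

which is exactly the pattern claimed above.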

Here we need to make a guess, and there is a bit of a jump in the steps. Even the textbook doesn't explain it in detail, and the reason is simple: a full explanation needs integrals, which many readers here may not have met yet. But this isn't a rigorous paper, so we can relax a little and make an educated guess from the formula above. Given the rule above and our goal — proving that $R_n(x)$ is a higher-order infinitesimal of $(x - x_0)^n$ — we can guess that it should be some function related to $(x - x_0)^{n+1}$.

With this guess in mind, let's apply Cauchy's mean value theorem: if $F$ and $G$ are continuous on $[a, b]$, differentiable on $(a, b)$, and $G'(x) \neq 0$, then there exists a point $\xi \in (a, b)$ such that

$$\frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(\xi)}{G'(\xi)}$$

We let $G(x) = (x - x_0)^{n+1}$. Applying the mean value theorem to $R_n$ and $G$ on the interval between $x_0$ and $x$, and using $R_n(x_0) = G(x_0) = 0$, we obtain:

$$\frac{R_n(x)}{(x - x_0)^{n+1}} = \frac{R_n(x) - R_n(x_0)}{(x - x_0)^{n+1} - 0} = \frac{R_n'(\xi_1)}{(n+1)(\xi_1 - x_0)^n}, \quad \xi_1 \in (x_0, x)$$

With that in hand, we look at the functions $R_n'(x)$ and $(n+1)(x - x_0)^n$ on the interval $(x_0, \xi_1)$ and apply the Cauchy mean value theorem again:

$$\frac{R_n'(\xi_1)}{(n+1)(\xi_1 - x_0)^n} = \frac{R_n'(\xi_1) - R_n'(x_0)}{(n+1)(\xi_1 - x_0)^n - 0} = \frac{R_n''(\xi_2)}{(n+1)n(\xi_2 - x_0)^{n-1}}, \quad \xi_2 \in (x_0, \xi_1)$$

Next comes the familiar nesting step. After applying the theorem $n+1$ times in total, we get:

$$\frac{R_n(x)}{(x - x_0)^{n+1}} = \frac{R_n^{(n+1)}(\xi)}{(n+1)!}, \quad \xi \in (x_0, x)$$

If we differentiate $P_n(x)$ a total of $n+1$ times we get 0, because every one of its terms has degree at most $n$, and after $n+1$ differentiations they all vanish. That is to say, $P_n^{(n+1)}(x) = 0$, so $R_n^{(n+1)}(x) = f^{(n+1)}(x)$. Substituting this into the equation above, we get:

$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}$$
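
As a quick sanity check of my own: for $n = 0$ the polynomial is just $P_0(x) = f(x_0)$, and the formula above says $f(x) - f(x_0) = f'(\xi)(x - x_0)$, which is nothing but the Lagrange mean value theorem. So the result at least contains the familiar special case.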



Proving the error bound


Next we prove that the error $R_n(x)$ is a higher-order infinitesimal of $(x - x_0)^n$.

The proof is now pretty straightforward. On a fixed interval $(a, b)$, it is clear that $\left|f^{(n+1)}(x)\right|$ has a maximum; let's call that maximum $M$. That is to say, $\left|f^{(n+1)}(\xi)\right| \le M$.

So:

$$\left|\frac{R_n(x)}{(x - x_0)^n}\right| = \left|\frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)\right| \le \frac{M}{(n+1)!}\left|x - x_0\right|$$

Because $x \to x_0$ and $M$ is a constant, this bound goes to 0, which is easy to verify from the definition of a limit. So we have proved that the error $R_n(x)$ is a higher-order infinitesimal of $(x - x_0)^n$.

So we can get:

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}$$

Since we only ever used derivatives of the original function up to order $n$, we call this the $n$-th order Taylor expansion of $f(x)$. The last term, $\frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}$, is called the Lagrange remainder. We can also write the error as $o\big((x - x_0)^n\big)$, which is called the Peano remainder; it describes the same error as the Lagrange remainder, just written differently.

If we set $x_0 = 0$, we can simplify this even further. Since $\xi$ lies between 0 and $x$, we can write $\xi = \theta x$ with $0 < \theta < 1$, and the formula becomes:

$$f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(n)}(0)}{n!}x^n + \frac{f^{(n+1)}(\theta x)}{(n+1)!}x^{n+1}, \quad 0 < \theta < 1$$

This is a much simpler formula than the one above, and it has its own name: Maclaurin's formula. The Peano remainder under Maclaurin's formula is written $o(x^n)$, which looks pretty simple too.
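
As a concrete instance (a standard one, not specific to this article), the fifth-order Maclaurin expansion of $\sin x$ with a Peano remainder reads

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + o(x^5),$$

since the even-order derivatives of $\sin x$ vanish at 0 and the odd ones alternate between $1$ and $-1$.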

If this feels like too much to remember, you can ignore the rest and just remember Maclaurin's formula. As for the Lagrange remainder, we only need it when we want to estimate the error, so when the error doesn't matter we can ignore it too.


An example


Let’s look at a practical example to get a feel for the power of Taylor’s formula.

We all know that some functions are hard to evaluate directly, like $e^x$, sines, and cosines. Since $e$ is an irrational number, have you ever wondered how we actually evaluate a function involving $e$? A lot of the time, the answer is Taylor's formula.

Let's take $e^x$ as the example and see how to compute it with Taylor's formula.

To simplify the calculation, we obviously use Maclaurin's formula. Since the derivative of $e^x$ is $e^x$ itself, when $f(x) = e^x$ we have $f^{(k)}(x) = e^x$ for every $k$, and $f^{(k)}(0) = e^0 = 1$.

So we can get:

$$e^x \approx 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$

If we substitute into Taylor's formula with the Lagrange remainder, we get:

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{e^{\theta x}}{(n+1)!}x^{n+1}, \quad 0 < \theta < 1$$

If we treat the last term as the error, we can get:

$$|R_n(x)| = \left|\frac{e^{\theta x}}{(n+1)!}x^{n+1}\right| \le \frac{e^{|x|}}{(n+1)!}|x|^{n+1}$$

When $n = 10$ and $x = 1$, the resulting error is:

$$|R_{10}(1)| = \frac{e^{\theta}}{11!} \le \frac{e}{11!} < \frac{3}{39916800} \approx 7.5 \times 10^{-8}$$

We can do a little arithmetic and see that this error is less than $10^{-7}$, which is more than close enough. In other words, by turning a function that isn't easy to compute into a sum of polynomial terms, we can easily get an approximation that is close enough, and on top of that we can bound the maximum error. It's perfect.
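
To see this in action, here is a short Python sketch (my own check, not part of the derivation) that sums the series up to $n = 10$ at $x = 1$ and compares the result with the built-in value of $e$:

```python
from math import e, factorial

# Sum the Maclaurin series of e^x at x = 1 up to the x^10 / 10! term.
n = 10
approx = sum(1.0 / factorial(k) for k in range(n + 1))

# Lagrange remainder bound: e^theta / 11! <= e / 11! < 3 / 11!
bound = 3.0 / factorial(n + 1)

print(approx)            # 2.7182818011...
print(abs(e - approx))   # actual error, about 2.7e-08
print(bound)             # error bound, about 7.5e-08
```

The actual error indeed stays below the bound we computed by hand.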


Some thoughts


We're not quite done. After all this derivation and calculation, does a question occur to you: how did Taylor come up with such a wonderful, intricate, and useful formula? It's hard to chalk it up to a moment of inspiration. After all, inspiration usually illuminates a single point, and it's hard to see so many formulas and conclusions coming out of one flash of insight.

I was completely unaware of this question when I was in school, but something felt off when I revisited the material. Of course, you might point out that several mathematicians' names appear here, so it obviously wasn't the work of a single person. Even so, I still wondered what kind of thinking leads to such a great formula.

It wasn't until I happened across an answer by the Zhihu expert Sahuan that it suddenly dawned on me.

If $f(x)$ equals $g(x)$, then obviously all the derivatives of $f(x)$ and $g(x)$ are equal as well. So here's the question: if we artificially construct a function $h(x)$ whose derivatives all agree with those of $g(x)$, can we assume that this artificial function is also equal to $g(x)$?

However, some functions have infinitely many derivatives, and we can't match all of them by hand, so we settle for the next best thing and match the first $n$. Obviously there is then an error, so we need to know how large that error is, and that is where the Lagrange remainder comes in.

The emergence and derivation of Taylor's formula grew out of this idea, and it gave me a new idea in turn: if we treat each derivative term as a feature, the problem turns into a regression problem from machine learning (see the sketch below). In machine learning we set an optimization objective and an optimization method and let the model learn the fit through training, whereas Taylor's formula reaches its result purely through reasoning and mathematical derivation. The goal and the outcome are the same, but the process is completely different. Two seemingly unrelated problems arrive at the same destination; you have to admit the charm of mathematics is remarkable.
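
Here is a toy version of that comparison in Python. This is purely my own illustration: I use NumPy's least-squares `polyfit` as a stand-in for the "trained model" and compare its coefficients with the Taylor coefficients $1/k!$ of $e^x$.

```python
from math import factorial
import numpy as np

# "Training data": samples of e^x on a small interval around 0.
xs = np.linspace(-0.5, 0.5, 200)
ys = np.exp(xs)

# Regression view: fit a degree-4 polynomial by least squares.
fitted = np.polyfit(xs, ys, deg=4)[::-1]          # reorder to [a0, a1, ..., a4]

# Taylor view: the coefficients come from derivation, a_k = 1 / k!.
taylor = np.array([1.0 / factorial(k) for k in range(5)])

print(np.round(fitted, 4))   # close to [1.  1.  0.5  0.1667  0.0417]
print(np.round(taylor, 4))   # [1.  1.  0.5  0.1667  0.0417]
```

The two coefficient vectors come out almost identical, which is exactly the point: training by fitting and deriving by Taylor's formula land in nearly the same place by completely different routes.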

That's all for today's article. If you feel you've gained something, please scan the QR code and follow the account. Your support means a lot to me.