xkcd #2986: Every Scientific Field

schnurrito@discuss.tchncs.de · 5 months ago

affiliate@lemmy.world · 5 months ago

How is it so hard?

I think a lot of the reason is that fields (the real numbers in this case) have some pretty lousy categorical properties, and you can’t define a very nice additive and multiplicative structure on ℝⁿ for n > 2 (for n = 2 you still get the complex numbers). So you end up having to deal with vector spaces instead of fields, i.e., you can’t (in general) multiply or divide points in ℝⁿ by other points in ℝⁿ, so you have way fewer tricks at your disposal. The other thing is that you don’t have a natural way to order points in ℝⁿ, so nice things like the mean value theorem sort of disappear. There are a few other complications as well, but I think those are the big ones. It’s a whole other beast than single-variable analysis.

How is it that the more I’m “learning” for that damn math exam the less I know?

I feel like this is just an unfortunate part of learning math. I’m not really sure that feeling ever goes away, but it usually means you’re making progress. My experience has been that the more math I learn, the more comfortable I get with the things I already know, and the more I realize how much is left to learn. So it feels like I only really know the “basic stuff” and continue to struggle with the “hard stuff”. My advice would be to try to not be discouraged by it, although it’s easier said than done.

Why do I need it in the first place?

Multivariate analysis is super useful in applications, especially for 3d rendering/modeling. It shows up a lot in video game/physics programming, and probably a bunch of other things too. It’s also foundational for more advanced things like tensor calculus/differential geometry/special relativity.

Why have exams at all?

I’m going to be real with you, I’m completely on your side on this one. I think exams just cause a bunch of stress and that it would be better to just get rid of them. I never liked exams.

Onto the more technical questions. I’ll try to make things handwavey to hopefully make the “big picture” shine through a bit. I think analysis textbooks are a bit guilty of getting too wrapped up in the details and missing the forest for the trees (or however the saying goes).

What the hell is a total derivative, and why is it suddenly the same as a tangential plane?

The total derivative is basically just a way to turn calculus problems into linear algebra problems. I think it’s best understood by first looking at the one dimensional case, and then trying to generalize it a bit to higher dimensions. The key idea is this:

The derivative of a function f: ℝ -> ℝ at a point x₀ is the best way to approximate f with a straight line near the point x₀. This means that the linear equation y = f’(x₀) * (x - x₀) + f(x₀) is the most accurate straight-line approximation of f near x₀.

Notice how in the 1-dimensional case this is just a “clever” way to rephrase that f’(x₀) is the “instantaneous slope” of f at x₀.
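If it helps to see the 1-dimensional idea with numbers, here is a small sketch. The choice of f = sin and x₀ = 1 is just mine for illustration, and `tangent_line` is a made-up helper name. The thing to look for is that the error of the straight-line approximation shrinks faster than the step h does, which is roughly what "best" means here.

```python
import math

def tangent_line(f, df, x0):
    """Return the line x -> f'(x0) * (x - x0) + f(x0)."""
    slope = df(x0)
    return lambda x: slope * (x - x0) + f(x0)

# Illustrative choice (an assumption, not from the thread): f = sin, f' = cos.
f = math.sin
df = math.cos
x0 = 1.0
line = tangent_line(f, df, x0)

for h in (0.1, 0.01, 0.001):
    err = abs(f(x0 + h) - line(x0 + h))
    # err / h should also go to 0 as h shrinks
    print(f"h = {h}: error = {err:.2e}, error / h = {err / h:.2e}")
```

Any other straight line through (x₀, f(x₀)) would leave an error that only shrinks about as fast as h itself.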

In higher dimensions, it no longer makes sense to approximate f with a straight line, because lines are 1-dimensional objects, whereas the domain/codomain of the function might not necessarily be 1-dimensional. However, it does still make sense to talk about the best linear approximation of f. A bit of linear algebra knowledge helps to make this idea clearer, but I’ll try to do my best to explain it with as little linear algebra as I can. (But let me know if you want a more linear algebra heavy explanation.)

A higher dimensional linear function is (basically) just a matrix, and a matrix is basically just a way to (linearly) turn one vector into another vector. At a high level, you can think of a matrix as turning a copy of ℝᵐ into a copy of ℝⁿ, possibly rotating/shearing/scaling things in the process. (Translations are not included, since a linear map always sends the origin to the origin.) Compare this to the 1-dimensional case, where a 1 x 1 matrix is just a number, and multiplying by a number “turns one copy of ℝ into another copy of ℝ”, provided that number isn’t 0.
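Here is a tiny sketch of "a matrix turns one vector into another", in plain Python so nothing needs installing. The matrix `A` and the vectors are arbitrary values I picked, and `mat_vec` is a hypothetical helper. It takes vectors in ℝ³ to vectors in ℝ², and the check at the end is what "linearly" means: transforming a sum gives the same result as summing the transformed vectors.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v (a list of numbers)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# An arbitrary 2 x 3 matrix: input vectors have 3 entries, outputs have 2.
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]

u = [1.0, 0.0, 2.0]
v = [0.0, 1.0, 1.0]

# Linearity: A(u + v) should equal A(u) + A(v).
lhs = mat_vec(A, [a + b for a, b in zip(u, v)])
rhs = [a + b for a, b in zip(mat_vec(A, u), mat_vec(A, v))]
print(lhs, rhs)
```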

So, the total derivative is basically just a matrix that gives the best way to approximate a multivariable function f near a vector x₀. And as you vary the input vectors, you end up tracing out a copy of ℝⁿ for some n. i.e., you get an n-dimensional plane that corresponds to the “best” approximation for f. And “best approximation” is just a slightly less fancy way of saying “tangent”.
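To make that a bit more concrete, here is a rough numerical sketch. Everything in it is an assumption for illustration: the map `f` from ℝ² to ℝ² is one I made up, and `jacobian` estimates the matrix of partial derivatives with finite differences rather than computing it exactly. The point is that f(x₀) plus "matrix times small step" lands very close to the true value of f.

```python
def f(p):
    """A made-up map from R^2 to R^2, chosen only for illustration."""
    x, y = p
    return [x * x * y, x + y * y * y]

def jacobian(func, p, h=1e-6):
    """Estimate the matrix of partial derivatives using central differences."""
    columns = []
    for j in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[j] += h
        backward[j] -= h
        fp, fm = func(forward), func(backward)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # columns[j][i] holds d f_i / d x_j; transpose so rows index the outputs.
    return [list(row) for row in zip(*columns)]

def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

x0 = [1.0, 2.0]
J = jacobian(f, x0)

def linear_approx(p):
    """Best linear approximation of f near x0: f(x0) + J * (p - x0)."""
    step = [a - b for a, b in zip(p, x0)]
    return [a + b for a, b in zip(f(x0), mat_vec(J, step))]

p = [1.01, 2.02]
print("true value:   ", f(p))
print("linear approx:", linear_approx(p))
```

For this particular `f`, the exact matrix at (1, 2) works out by hand to [[4, 1], [1, 12]], so you can compare that against what `J` prints.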

Why is the gradient just a collection of the first partial derivatives?

I always found the gradient to be a bit confusing. But I think it’s best understood in terms of what it does, and not in terms of how it’s defined. The “purpose” of the gradient is to let you compute the directional derivative, i.e., the derivative in the direction of a given vector v. So, let’s use the notation

(∇f)(v) to denote the directional derivative of f, in the direction of v.

Let’s consider the 3-dimensional case and write v = a₁e₁ + a₂e₂ + a₃e₃ for basis vectors e_i and real numbers a_i.

Since “taking the derivative” is linear, we would expect to have

(∇f)(v) = (∇f)(a₁e₁ + a₂e₂ + a₃e₃) = a₁(∇f)(e₁) + a₂(∇f)(e₂) + a₃(∇f)(e₃).

In other words, we only need to compute the directional derivative in the direction of each basis vector in order to figure out the gradient. That’s pretty nice! Also, the derivative of f in the direction of e_i is exactly the partial derivative of f taken with respect to e_i. Let’s write f_i for the partial derivative with respect to e_i (just because I don’t know how well Lemmy handles double subscripts). Then we can rewrite the above equation as

(∇f)(v) = a₁f₁ + a₂f₂ + a₃f₃.

Now compare that with the dot product of the vectors (f₁, f₂, f₃) and v = (a₁, a₂, a₃). It’s exactly the same. So, the gradient can be defined in terms of taking the dot product of a vector with the partial derivatives. But I think that kind of loses a lot of the intuitive meaning of the gradient in the process.
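If you want to check this numerically, here is a hedged sketch. The function `f`, the point `p`, and the direction `v` are all arbitrary choices of mine, and both derivatives are estimated with finite differences. It computes the directional derivative two ways: directly along v, and as the dot product of the partial derivatives with v. Like the notation above, it does not normalize v to unit length.

```python
def f(p):
    """A made-up function from R^3 to R, chosen only for illustration."""
    x, y, z = p
    return x * y + z * z

def partial(func, p, i, h=1e-6):
    """Estimate the partial derivative with respect to coordinate i."""
    forward = list(p)
    backward = list(p)
    forward[i] += h
    backward[i] -= h
    return (func(forward) - func(backward)) / (2 * h)

def directional(func, p, v, h=1e-6):
    """Estimate the derivative of func at p along the vector v."""
    forward = [a + h * b for a, b in zip(p, v)]
    backward = [a - h * b for a, b in zip(p, v)]
    return (func(forward) - func(backward)) / (2 * h)

p = [1.0, 2.0, 3.0]
v = [0.5, -1.0, 2.0]

grad = [partial(f, p, i) for i in range(3)]   # (f_1, f_2, f_3)
dot = sum(g * a for g, a in zip(grad, v))     # a_1 f_1 + a_2 f_2 + a_3 f_3
direct = directional(f, p, v)

print("partials:           ", grad)
print("dot product with v: ", dot)
print("directly along v:   ", direct)
```

The last two numbers should agree up to rounding error, which is the whole content of the formula above.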

I hope you found some of this helpful, and feel free to ask if you have any more questions/found something I said confusing.

PlexSheep@infosec.pub · 5 months ago

Thanks for answering my frustrated questions, was a long day yesterday. I’ll try to understand the deeper truths later, but I can already tell the matrix stuff goes over my head.