• PlexSheep@infosec.pub
    2 months ago

    I’m not chilling. Second try on multivariate analysis in a week. I don’t want to fail.

    (Yes I’m procrastinating by writing this comment)

    • affiliate@lemmy.world
      2 months ago

      that’s fair. i remember multivariate being a bit rough back in the day. i feel like a lot of the difficulty with it might be due to how many shortcuts are taken when explaining things in singlevariate analysis, since a fair number of core concepts and tools don’t translate super well into the multivariate case.

      i think the worst offender is the idea of the derivative as “the slope”, since that makes it quite hard to guess what the multidimensional derivative should be, and it makes the notions of gradient and partial derivatives a bit suspect. but some of the ways they teach integration in singlevariate analysis also don’t translate super well.

      i feel like with calculus a big part of the difficulty is in building up the intuition about how things work and what things mean, but my experience has been that that’s not a huge part of calculus courses. knowing some of the history of differentials and infinitesimals can also help, since that’s how calculus was first done, and it helps to understand the notation as well.

      i hope some of this helps, and feel free to ask if you have any questions about some of the concepts

      • PlexSheep@infosec.pub
        2 months ago

        It should be easy, it’s just analysis but with an added dimension, basically. How is it so hard? How is it that the more I’m “learning” for that damn math exam the less I know? Why do I need it in the first place? Why have exams at all? I know what I know, and it’s not like I’m learning anything by preparing for them. I hate exams so much, it’s so stressful.

        I doubt you have the answers to that, even if you did, they wouldn’t really help. So let’s ask something useful, since you’re offering.

        What the hell is a total derivative, and why is it suddenly the same as a tangential plane?

        Why is the gradient just a collection of the first partial derivatives? How is a tuple of them useful at all? Apparently it’s showing the direction of steepest ascent or something? I don’t get it.

        • affiliate@lemmy.world
          2 months ago

          How is it so hard?

          I think a lot of the reason is that fields (the real numbers in this case) have some pretty lousy categorical properties, and you can’t define a very nice additive and multiplicative structure on ℝⁿ for n > 2 (n = 2 still works because of the complex numbers). So you end up having to deal with vector spaces instead of fields, i.e., you can’t (in general) multiply or divide points in ℝⁿ by other points in ℝⁿ, so you have way fewer tricks at your disposal. The other thing is that you don’t have a natural way to order points in ℝⁿ, so nice things like the mean value theorem sort of disappear, at least in their usual form. There are a few other complications as well, but I think those are the big ones. It’s a whole other beast than singlevariate analysis.

          How is it that the more I’m “learning” for that damn math exam the less I know?

          I feel like this is just an unfortunate part of learning math. I’m not really sure that feeling ever goes away, but it usually means you’re making progress. My experience has been that the more math I learn, the more comfortable I get with the things I already know, and the more I realize how much is left to learn. So it feels like I only really know the “basic stuff” and continue to struggle with the “hard stuff”. My advice would be to try to not be discouraged by it, although it’s easier said than done.

          Why do I need it in the first place?

          Multivariate analysis is super useful in applications, especially for 3d rendering/modeling. It shows up a lot in video game/physics programming, and probably a bunch of other things too. It’s also foundational for more advanced things like tensor calculus/differential geometry/special relativity.

          Why have exams at all?

          I’m going to be real with you, I’m completely on your side on this one. I think exams just cause a bunch of stress and that it would be better to just get rid of them. I never liked exams.

          Onto the more technical questions. I’ll try to make things handwavey to hopefully make the “big picture” shine through a bit. I think analysis textbooks are a bit guilty of getting too wrapped up in the details and missing the forest for the trees (or however the saying goes).

          What the hell is a total derivative, and why is it suddenly the same as a tangential plane?

          The total derivative is basically just a way to turn calculus problems into linear algebra problems. I think it’s best understood by first looking at the one dimensional case, and then trying to generalize it a bit to higher dimensions. The key idea is this:

          The derivative of a function f: ℝ -> ℝ at a point x0 is the best way to approximate f with a straight line at the point x0. This means that the linear equation y = f’(x0) * (x - x0) + f(x0) is the most accurate straight-line approximation of f near x0.

          Notice how in the 1-dimensional case this is just a “clever” way to rephrase that f’(x0) is the “instantaneous slope” of f at x0.
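          A rough numerical sketch of the "best straight-line approximation" idea. The choice of f = sin, x0 = 1 and the helper name `tangent_line` are assumptions made purely for illustration. The point is that the gap between f and its tangent line shrinks faster than the step h itself:

```python
import math

def tangent_line(f, df, x0):
    """Return the tangent-line approximation of f at x0 as a function of x."""
    return lambda x: f(x0) + df(x0) * (x - x0)

# Illustrative choices (not from the thread): f = sin, so f' = cos.
f = math.sin
df = math.cos
x0 = 1.0
approx = tangent_line(f, df, x0)

# Compare f with its tangent line at points closer and closer to x0.
steps = (0.1, 0.01, 0.001)
errors = []
for h in steps:
    err = abs(f(x0 + h) - approx(x0 + h))
    errors.append(err)
    # err / h also tends to 0, which is what "best linear approximation" means.
    print(f"h={h:<6} error={err:.2e} error/h={err / h:.2e}")
```

          Any other line through (x0, f(x0)) would leave an error that only shrinks proportionally to h, so error/h would not go to 0.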

          In higher dimensions, it no longer makes sense to approximate f with a straight line, because lines are 1-dimensional objects, whereas the domain/codomain of the function might not necessarily be 1-dimensional. However, it does still make sense to talk about the best linear approximation of f. A bit of linear algebra knowledge helps to make this idea clearer, but I’ll try to do my best to explain it with as little linear algebra as I can. (But let me know if you want a more linear algebra heavy explanation.)

          A higher dimensional linear function is (basically) just a matrix, and a matrix is basically just a way to (linearly) turn one vector into another vector. At a high level, you can think of an n x m matrix as turning a copy of ℝᵐ into a copy of ℝⁿ, possibly rotating/shearing/scaling things in the process (but never shifting the origin, since a linear map always sends 0 to 0). (Compare this to the 1-dimensional case, where a 1 x 1 matrix is just a number, and multiplying by a number “turns one copy of ℝ into another copy of ℝ”, provided that number isn’t 0.)
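          If the matrix part feels abstract, here is a minimal sketch in plain Python. The matrix `A`, the vectors `u`, `v` and the helper `matvec` are all made-up examples, not anything from a particular library. It shows a 2 x 3 matrix taking vectors in ℝ³ to vectors in ℝ², and checks the property that "linear" refers to:

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x (a list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# A 2 x 3 matrix: it takes vectors in R^3 and returns vectors in R^2.
A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, -1.0]]

u = [1.0, 2.0, 3.0]
v = [0.5, -1.0, 4.0]
c = 2.5

# Linearity: applying A to u + c*v gives the same result as A(u) + c*A(v).
lhs = matvec(A, [ui + c * vi for ui, vi in zip(u, v)])
rhs = [p + c * q for p, q in zip(matvec(A, u), matvec(A, v))]
print("A(u + c*v)    =", lhs)
print("A(u) + c*A(v) =", rhs)

# A linear map always sends the zero vector to the zero vector.
print("A(0)          =", matvec(A, [0.0, 0.0, 0.0]))
```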

          So, the total derivative is basically just a matrix that gives the best way to approximate a multivariable function f at a vector x0. And as you vary the input vectors, you end up tracing out a copy of ℝⁿ for some n, i.e., you get an n-dimensional plane that corresponds to the “best” approximation for f. (For a function f: ℝ² -> ℝ, that plane is the usual tangent plane to the graph of f.) And “best approximation” is just a slightly less fancy way of saying “tangential”.
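          A hedged numerical sketch of the same idea in two dimensions. The map `f`, the point (1, 2) and the finite-difference helper `jacobian` are assumptions chosen for illustration. The matrix of partial derivatives is estimated numerically, and f(x0) + J·h is then compared against the true value of f near x0:

```python
import math

def f(x, y):
    """Hypothetical example map from R^2 to R^2."""
    return (x * x * y, math.sin(x) + y)

def jacobian(func, x, y, eps=1e-6):
    """Approximate the 2 x 2 matrix of partial derivatives by central differences."""
    fx_plus, fx_minus = func(x + eps, y), func(x - eps, y)
    fy_plus, fy_minus = func(x, y + eps), func(x, y - eps)
    return [
        [(fx_plus[i] - fx_minus[i]) / (2 * eps),
         (fy_plus[i] - fy_minus[i]) / (2 * eps)]
        for i in range(2)
    ]

x0, y0 = 1.0, 2.0
J = jacobian(f, x0, y0)

def linear_approx(hx, hy):
    """Approximate f(x0 + hx, y0 + hy) by f(x0, y0) + J * (hx, hy)."""
    base = f(x0, y0)
    return tuple(base[i] + J[i][0] * hx + J[i][1] * hy for i in range(2))

# The approximation error should shrink much faster than the step size h.
errors = {}
for h in (0.1, 0.01):
    exact = f(x0 + h, y0 + h)
    approx = linear_approx(h, h)
    errors[h] = max(abs(e - a) for e, a in zip(exact, approx))
    print(f"h={h}: error={errors[h]:.2e}")
```

          Here J plays the role of the total derivative at (x0, y0): a matrix that turns a small displacement h into the corresponding small change in f.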

          Why is the gradient just a collection of the first partial derivatives?

          I always found the gradient to be a bit confusing. But I think it helps to understand it in terms of what it does, and not in terms of how it’s defined. The “purpose” of the gradient is to let you compute the directional derivative, i.e., the derivative of f in the direction of a given vector v. So, let’s use the notation

          (∇f)(v) to denote the directional derivative of f, in the direction of v.

          Let’s consider the 3-dimensional case and write v = a₁e₁ + a₂e₂ + a₃e₃ for basis vectors eᵢ and real numbers aᵢ.

          Since “taking the derivative” is linear, we would expect to have

          (∇f)(v) = (∇f)(a₁e₁ + a₂e₂ + a₃e₃) = a₁(∇f)(e₁) + a₂(∇f)(e₂) + a₃(∇f)(e₃).

          In other words, we only need to compute the directional derivatives along the basis vectors in order to figure out the gradient. That’s pretty nice! Also, the derivative of f in the direction of eᵢ is exactly the partial derivative of f with respect to the i-th coordinate. Let’s write fᵢ for that partial derivative (just because I don’t know how well Lemmy handles double subscripts). Then we can rewrite the above equation as

          (∇f)(v) = a₁f₁ + a₂f₂ + a₃f₃.

          Now compare that with the dot product of the vectors (f₁, f₂, f₃) and v = (a₁, a₂, a₃). It’s exactly the same. So the gradient is just the vector (f₁, f₂, f₃) of partial derivatives, and taking the dot product with it gives the directional derivative. But I think defining it that way kind of loses a lot of the intuitive meaning of the gradient in the process.
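          A small numerical check of the dot-product formula, again only as a sketch. The function `f`, the point `p`, the direction `v` and both helpers are made up for illustration. It estimates the partial derivatives, takes the dot product with v, and compares that with a directional derivative computed directly:

```python
import math

def f(x, y, z):
    """Hypothetical scalar function on R^3."""
    return x * x + 3.0 * y + math.sin(z)

def gradient(func, p, eps=1e-6):
    """Approximate the partial derivatives (f1, f2, f3) at p by central differences."""
    grad = []
    for i in range(len(p)):
        plus = list(p)
        minus = list(p)
        plus[i] += eps
        minus[i] -= eps
        grad.append((func(*plus) - func(*minus)) / (2 * eps))
    return grad

def directional_derivative(func, p, v, eps=1e-6):
    """Approximate the derivative of func at p in the direction v directly."""
    plus = [pi + eps * vi for pi, vi in zip(p, v)]
    minus = [pi - eps * vi for pi, vi in zip(p, v)]
    return (func(*plus) - func(*minus)) / (2 * eps)

p = [1.0, 0.0, 0.0]
v = [2.0, -1.0, 0.5]

grad = gradient(f, p)
via_dot = sum(g * a for g, a in zip(grad, v))   # a1*f1 + a2*f2 + a3*f3
direct = directional_derivative(f, p, v)
print("dot product with gradient:", via_dot)
print("direct estimate:          ", direct)

# Among unit directions, the one along the gradient gives the largest value.
norm = math.sqrt(sum(g * g for g in grad))
unit_grad = [g / norm for g in grad]
steepest = directional_derivative(f, p, unit_grad)
along_axes = [directional_derivative(f, p, e)
              for e in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])]
print("along gradient:", steepest, "along axes:", along_axes)
```

          The last part is where the "direction of steepest ascent" comes from: by the Cauchy-Schwarz inequality, the dot product of (f₁, f₂, f₃) with a unit vector is largest when that unit vector points along (f₁, f₂, f₃), and the value there is the length of the gradient.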

          I hope you found some of this helpful, and feel free to ask if you have any more questions/found something I said confusing.

          • PlexSheep@infosec.pub
            2 months ago

            Thanks for answering my frustrated questions, was a long day yesterday. I’ll try to understand the deeper truths later, but I can already tell the matrix stuff goes over my head.