We now know that L must be L(v²) for a free particle, but we want to know the exact form of the function. For this, we are going to use Galileo's relativity principle.
Consider frame K moving at an infinitesimal velocity |ε> wrt another frame K'. If from K we see a free particle moving with velocity |v>, we will see the same particle from K' moving with |v'> = |v> + |ε>.
The equations of motion must be the same for K and K'. For K we have the Lagrangian L and for K' we have L'. But in order for L and L' to produce the same equations of motion, they can only differ by a term that is a total derivative of a function that can only depend on t and q.
Since |v'> = |v>+|ε>, we square the expression and get v'² = v² + ε² + 2 <v|ε>.
In L(v²) we can now substitute v² by v'², so that L'=L(v'²). We don't want the form of the function to change when changing between frames, so L' must be the same function L, evaluated at v'².
Now we can write L(v'²) = L(v²+ε²+2<v|ε>). Then, since ε is small compared to v, we can expand this function in powers of |ε>, taking only 1st order terms.
How do we expand such a function? If we have f(x) for small x, we know that, at first order,
f(x) ≃ f(0) + x·(df/dx)₀,
where ()₀ means (df/dx) for x=0. For f(|r>) and small |r>, f(|r>) ≃ f(|0>) + (<∇|f)₀|r> = f(|0>) + (∂f/∂|r>)₀·|r>.
So, in our case, we have L(v²+ε²+2<v|ε>). How do we proceed in this case? Since what is small is ε and since we are going to expand in powers of ε, we have
L' ≃ L(ε=0) + (dL/d|ε>)₀·|ε>.
For f[g(x)], we would write
df[g(x)]/dx = (dg(x)/dx)·∂f/∂g.
In other words, differentiate f with respect to g as a whole and then multiply by the total derivative of g with respect to x. For our case, we can write g(|ε>) = <ε|ε> + 2<v|ε> + <v|v>. Then,
dL[g(|ε>)]/d|ε> = (dg/d|ε>)·∂L/∂g.
But now we do a trick: instead of ∂L/∂g we write ∂L/∂v². It must be the same result, since v² in g only appears once and without prefactors. This is a nice way of making L be operated by the only variable we know it has as dependence, which is v². If you don't trust this, practise with the example f=sin(x+v²) and see how ∂f/∂(x+v²)=∂f/∂x=∂f/∂v²=cos(x+v²).
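The suggested practice example can also be checked numerically. The following is a minimal sketch, assuming central finite differences are accurate enough here; the helper name `partial` and the sample values of x and v² are illustrative choices, not part of the derivation.

```python
import math

def f(x, v2):
    # The practice example from the text: f = sin(x + v²), with v² treated as one variable.
    return math.sin(x + v2)

def partial(func, args, index, h=1e-6):
    """Central finite-difference partial derivative with respect to args[index]."""
    up = list(args)
    up[index] += h
    down = list(args)
    down[index] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

x, v2 = 0.4, 1.7                    # arbitrary sample point
df_dx = partial(f, (x, v2), 0)      # ∂f/∂x
df_dv2 = partial(f, (x, v2), 1)     # ∂f/∂v²
expected = math.cos(x + v2)         # ∂f/∂(x+v²)
print(df_dx, df_dv2, expected)
```

All three printed numbers should agree to within the finite-difference error, because v² enters the argument once and without a prefactor.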
Also, dg/d|ε> = 2|v>+2|ε>. So we can write now
dL/d|ε> = (2|v>+2|ε>)·∂L/∂v².
The latter expression becomes 2|v>·∂L/∂v² for |ε> = |0>, as we are asked to do in the expansion.
We are ready to write the expansion at first order as
L' ≃ L(v²) + 2<ε|v>∂L/∂v².
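As a sanity check on this first-order expansion, here is a small numerical sketch. It assumes a hypothetical test function L = b(v² + v⁴), picked only because it depends on v² alone and is not quadratic; the vectors and the helper names are illustrative as well. If the expansion is correct, the error must shrink like ε², so making ε ten times smaller should make the error roughly a hundred times smaller.

```python
def lagrangian(v2, b=1.0):
    # Hypothetical test function of v² only (an assumption for this check).
    return b * (v2 + v2**2)

def dlagrangian_dv2(v2, b=1.0):
    # Analytic ∂L/∂v² of the test function above.
    return b * (1.0 + 2.0 * v2)

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def first_order_error(v, eps):
    """Return |L(v'²) - [L(v²) + 2<ε|v>·∂L/∂v²]| for v' = v + ε."""
    v_prime = [vi + ei for vi, ei in zip(v, eps)]
    exact = lagrangian(dot(v_prime, v_prime))
    v2 = dot(v, v)
    approx = lagrangian(v2) + 2.0 * dot(eps, v) * dlagrangian_dv2(v2)
    return abs(exact - approx)

v = [0.3, -0.5, 0.8]            # arbitrary particle velocity
direction = [1.0, 2.0, -1.0]    # arbitrary direction for ε
for scale in (1e-2, 1e-3, 1e-4):
    eps = [scale * d for d in direction]
    print(scale, first_order_error(v, eps))
```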
Now we have the structure we wanted: L' = L + something. In order for L' to produce the same equations of motion as L, we need that "something" to be a total time derivative of a function f(t,|r>).
The term is 2<ε|v>∂L/∂v². Let's ask first about the first condition: to be a total time derivative. Recall that |v> = d|r>/dt and see how the term already has a factor |v>. We don't worry about 2<ε| since they are constant. But we worry about ∂L/∂v². Recall that we already know L=L(v²). L could be something like L=bv² and then ∂L/∂v² = b, which would be good, as then we would only have |v> as a total derivative of time. Is there another choice? If L=b(v²+v⁴), L is still a function of v², since L=bv²+b(v²)², but in this case, ∂L/∂v² gives b+2bv², and then our term cannot be expressed as a total time derivative. It is "clear" that the term can only contain a single linear |v> in order for the whole term to be a total time derivative.
Another way of seeing this is that our term is constants·(∂L/∂v²)·(d/dt)(|r>). And we want the d/dt to be at the very beginning of the term. The constants can clearly enter as (∂L/∂v²)·(d/dt)(constants·|r>). But we need (∂L/∂v²) to enter inside as well, and its only choice is to be constant. Conclusion:
L(v²) = constant·v² = b·v².
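One way to probe this conclusion numerically: the time integral of a total time derivative depends only on the endpoints of the path, so the integral of 2<ε|v>·∂L/∂v² must be the same for any two paths that share their endpoints. The sketch below compares two such paths for L = b·v² and for the counterexample L = b(v²+v⁴). The paths, the value of ε and the midpoint integration rule are assumptions made for illustration.

```python
import math

EPS = (0.01, 0.02, 0.0)   # constant boost velocity ε (illustrative value)

def velocity_a(t):
    # Path A: r(t) = (t, t, 0), a straight line from (0,0,0) to (1,1,0).
    return (1.0, 1.0, 0.0)

def velocity_b(t):
    # Path B: r(t) = (t², sin(πt/2), 0), with the same endpoints as path A.
    return (2.0 * t, 0.5 * math.pi * math.cos(0.5 * math.pi * t), 0.0)

def dL_quadratic(v2, b=1.0):
    # ∂L/∂v² for L = b·v²
    return b

def dL_quartic(v2, b=1.0):
    # ∂L/∂v² for L = b·(v² + v⁴)
    return b * (1.0 + 2.0 * v2)

def integrated_term(velocity, dL_dv2, steps=2000):
    """Midpoint-rule integral of 2<ε|v>·∂L/∂v² over t in [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        v = velocity((k + 0.5) * h)
        v2 = sum(c * c for c in v)
        eps_dot_v = sum(e * c for e, c in zip(EPS, v))
        total += 2.0 * eps_dot_v * dL_dv2(v2) * h
    return total

gap_quadratic = (integrated_term(velocity_a, dL_quadratic)
                 - integrated_term(velocity_b, dL_quadratic))
gap_quartic = (integrated_term(velocity_a, dL_quartic)
               - integrated_term(velocity_b, dL_quartic))
print(gap_quadratic, gap_quartic)
```

The first gap should vanish up to integration error, while the second should not, which is consistent with the term being a total time derivative only when ∂L/∂v² is constant.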
So now it is clear that if L=b·v², then L'=b·v'². So we can write, applying Galileo's relativity, and now with a relative velocity |u> between K and K' that is not necessarily small,
L' = bv'² = b(v²+u²+2<u|v>) = bv² + bu² + 2b<u|v> = L + bu² + 2b<u|v> = L + d/dt(bu²·t + 2b<u|r>).
As we can see, L' and L only differ by a total time derivative. So they will produce the same equations of motion.
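The finite-velocity identity can be verified along any smooth path. This is a minimal sketch, assuming an arbitrary illustrative path r(t) and illustrative values for b and |u>; the function name `F` for bu²·t + 2b<u|r> is only a label chosen here.

```python
import math

B = 0.5                    # constant b (illustrative value)
U = (1.5, -2.0, 0.7)       # finite relative velocity u (illustrative value)

def position(t):
    # Arbitrary smooth path r(t); any differentiable path would do.
    return (math.sin(t), t**2, math.exp(-t))

def velocity(t):
    # Analytic time derivative of position(t).
    return (math.cos(t), 2.0 * t, -math.exp(-t))

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def lagrangian_difference(t):
    """L' - L = b·|v + u|² - b·v², evaluated on the path."""
    v = velocity(t)
    v_prime = tuple(vi + ui for vi, ui in zip(v, U))
    return B * dot(v_prime, v_prime) - B * dot(v, v)

def F(t):
    """F(t) = b·u²·t + 2b·<u|r(t)>."""
    return B * dot(U, U) * t + 2.0 * B * dot(U, position(t))

def dF_dt(t, h=1e-5):
    # Central finite difference of F.
    return (F(t + h) - F(t - h)) / (2.0 * h)

for t in (0.0, 0.4, 1.3):
    print(t, lagrangian_difference(t), dF_dt(t))
```

At each sampled time the two printed values should coincide up to finite-difference error.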
What is b? From these arguments, we can only *define* it as b=m/2, with m being the mass of the free particle. Then,
L = m·v²/2.
If we have a set of non-interacting particles, the total L is
L = ∑ mₐ·vₐ² / 2.
So the scale that we choose for mass is not entirely arbitrary. We can change the scale of mass by a global prefactor and that's OK, but we cannot arbitrarily change the scale of each individual particle.
In other words: the *ratio of the masses* between different particles is not an arbitrary quantity by any means. These ratios are physically meaningful!
If we want S to be a minimum for the actual path, then m must be positive. If m were negative, then we could always imagine a path for which S is smaller and smaller (deeper and deeper in the negative domain of numbers, we mean).
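The role of the sign of m can be illustrated with a one-dimensional sketch. It assumes a hypothetical family of trial paths x(t) = t + a·sin(πt) on t in [0, 1], all sharing the endpoints x(0) = 0 and x(1) = 1, where a = 0 is the straight free-particle path; the action is approximated with a midpoint rule.

```python
import math

def action(mass, amplitude, steps=1000):
    """Midpoint-rule action S = ∫ (m/2)·ẋ² dt for x(t) = t + a·sin(πt), t in [0, 1].

    Every amplitude gives the same endpoints x(0) = 0 and x(1) = 1.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x_dot = 1.0 + amplitude * math.pi * math.cos(math.pi * t)
        total += 0.5 * mass * x_dot**2 * h
    return total

for a in (0.0, 0.5, 2.0, 8.0):
    print(a, action(+1.0, a), action(-1.0, a))
```

For positive mass the straight path (a = 0) gives the smallest action in this family, whereas for negative mass the action keeps decreasing as the amplitude grows.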
It is very useful to write dl²=dx²+dy²+dz², so that
v² = (dl/dt)² = dl² / dt²,
because then, the Lagrangian can be written as
L = (m/2)·dl²/dt².
Why is this useful? Because we often don't work with Cartesian coordinates, in which dl²=dx²+dy²+dz², and the way dl² is written in the new coordinates determines what L looks like.
For Cartesian coordinates, we get
L = (m/2)·(ẋ² + ẏ² + ż²).
For cylindrical coordinates, dl² is not so easy. Here (x,y) are described by polar coordinates and z is the third coordinate. This means that (x,y)=r(cosϕ,sinϕ), so (dx,dy) = dr(cosϕ,sinϕ) + rdϕ(-sinϕ,cosϕ) and dx²+dy² = dr² + r²dϕ². We finally have dl² = dr² + r²dϕ² + dz², and then,
L = (m/2)·([r·]² + r²[ϕ·]² + [z·]²).
Unfortunately, we cannot place a dot above ϕ, so we use [ϕ·] instead. For r we can write ṙ but for consistency we use [r·].
For spherical coordinates,
(x,y,z) = r(cosϕsinθ, sinϕsinθ, cosθ),
dx = dr·cosϕsinθ + r·dθ·cosϕcosθ - r·dϕ·sinϕsinθ,
dy = dr·sinϕsinθ + r·dθ·sinϕcosθ + r·dϕ·cosϕsinθ,
dz = dr·cosθ - r·dθ·sinθ,
dl² = dr² + r²dθ² + r²sin²θdϕ²,
so the Lagrangian is
L = (m/2)·([r·]² + r²[θ·]² + r²sin²θ[ϕ·]²).
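The spherical form can be cross-checked against the Cartesian one. The sketch below assumes an arbitrary illustrative trajectory r(t), θ(t), ϕ(t) and an illustrative mass; it evaluates L from the spherical expression and compares it with (m/2)·(ẋ² + ẏ² + ż²), where the Cartesian velocities come from a central finite difference of the same trajectory.

```python
import math

M = 2.0  # particle mass (illustrative value)

def spherical_path(t):
    # Arbitrary smooth r(t), θ(t), ϕ(t) together with their analytic time derivatives.
    r, r_dot = 1.0 + 0.5 * t, 0.5
    theta, theta_dot = 0.6 + 0.3 * t, 0.3
    phi, phi_dot = 2.0 * t, 2.0
    return (r, theta, phi), (r_dot, theta_dot, phi_dot)

def to_cartesian(r, theta, phi):
    # (x,y,z) = r(cosϕ sinθ, sinϕ sinθ, cosθ)
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))

def lagrangian_spherical(t):
    (r, theta, _), (r_dot, theta_dot, phi_dot) = spherical_path(t)
    return 0.5 * M * (r_dot**2 + r**2 * theta_dot**2
                      + r**2 * math.sin(theta)**2 * phi_dot**2)

def lagrangian_cartesian(t, h=1e-5):
    # Cartesian velocity components from a central finite difference.
    ahead = to_cartesian(*spherical_path(t + h)[0])
    behind = to_cartesian(*spherical_path(t - h)[0])
    v2 = sum(((a - b) / (2.0 * h))**2 for a, b in zip(ahead, behind))
    return 0.5 * M * v2

for t in (0.0, 0.5, 1.2):
    print(t, lagrangian_spherical(t), lagrangian_cartesian(t))
```

The same kind of check works for the cylindrical case by swapping in (x,y,z) = (r·cosϕ, r·sinϕ, z) and the corresponding expression for L.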