6. Four-vectors.

Four-vectors.

Here we enter into a really challenging zone, not only because of the physics and mathematics: it will also challenge the attempt to write it in plain text. I am not sure about the outcome of this.

The four-dimensional position vector will be (ct,x,y,z), or alternatively, (x⁰,x¹,x²,x³). Or even more compressed, just xⁱ where i runs from 0 up to 3.

The "square" or "length" of this "vector" is (x⁰)² - (x¹)² - (x²)² - (x³)², and it does not change under Lorentz "rotations".

But we don't need the vector to be a position vector. Any (A⁰,A¹,A²,A³) that behaves the same way will be a four-vector and will follow the same Lorentz transf (here a change from K' to K with relative speed u along x):

A⁰ = (A⁰' + (u/c)A¹') / √ 
A¹ = (A¹' + (u/c)A⁰') / √
A² = A²'; A³ = A³'

where we use √ as a shorthand for (1 - (u/c)²)^(1/2) from now on. Or √{u} if we want to specify the value of the speed inside. This expression is so common in relativity that it is pointless to write it again and again without some shortcut.

For this generic 4-v (four-vector shorthand), we have (A⁰)² - (A¹)² - (A²)² - (A³)² = invariant under Lorentz transf.
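To make this concrete, here is a tiny numerical check (in Python with NumPy; the language and the numbers are my choice, not the book's): boost an arbitrary 4-v and verify that the square above does not change.

    import numpy as np

    b = 0.6                                   # u/c for the boost
    root = np.sqrt(1 - b**2)                  # our √ shorthand
    L = np.array([[1/root, b/root, 0, 0],     # x-boost from K' to K
                  [b/root, 1/root, 0, 0],
                  [0,      0,      1, 0],
                  [0,      0,      0, 1]])

    Ap = np.array([2.0, -1.0, 0.5, 3.0])      # components A'ⁱ in K'
    A = L @ Ap                                # components Aⁱ in K

    def square(V):                            # (V⁰)² - (V¹)² - (V²)² - (V³)²
        return V[0]**2 - V[1]**2 - V[2]**2 - V[3]**2

    print(square(Ap), square(A))              # the same number twice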

The superscripts for a 4-v are called *contravariant* components, and if we write the 4-v with subscripts then we call the components *covariant*. The contravariant version of a 4-v is Aⁱ, and its covariant version is Aᵢ.

The reason to distinguish them is signs:

A₀ =  A⁰
A₁ = -A¹
A₂ = -A²
A₃ = -A³.

Spatial components change sign when we change subscript to superscript and vice versa. The time (0) component does not change sign.

For Lorentz transf, contravariant and covariant components have opposite sign in the numerators.

In Euclidean geometry we can write the square of the magnitude of a vector |A> as

∑[AⁱAᵢ;i=1;3] = A²,

where we notice the notation for the sum:

∑[expression;index = initial value;final value].

So in our new geometry we can write the square of the magnitude of the 4-v A as

∑[AⁱAᵢ;i=0;3] = A⁰A₀ + A¹A₁ + A²A₂ + A³A₃ = AⁱAᵢ.

Notice the last step: we omit the ∑ sign when the same index is repeated at different positions. One position must be a superscript and the other a subscript. If both are subscripts or both superscripts then we don't sum over them. When we sum over them, the repeated index is called a dummy index. This is known as the Einstein convention.

In contrast to most modern books, the authors take Latin letters i,j,k etc for 4-d indices, and Greek letters when we want to refer only to the 3d xyz coordinates of space.

This is closely related to the *scalar* or *dot* product of two vectors, which for 4-v is

AⁱBᵢ = A⁰B₀ + A¹B₁ + A²B₂ + A³B₃ = AᵢBⁱ.

Mind the last step, which gives the same result!

The product AⁱBᵢ is a four-scalar. This means that it is a scalar (a number) that is invariant under Lorentz transf.
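A quick sketch of the summation convention in code (Python/NumPy again, my choice): build the covariant components by flipping the spatial signs, then sum over the repeated index.

    import numpy as np

    A = np.array([2.0, -1.0, 0.5, 3.0])     # contravariant Aⁱ
    B = np.array([1.0,  4.0, 2.0, -2.0])    # contravariant Bⁱ

    s = np.array([1.0, -1.0, -1.0, -1.0])   # sign flips for lowering an index
    A_cov = s * A                           # covariant Aᵢ
    B_cov = s * B                           # covariant Bᵢ

    print(np.dot(A, B_cov))                 # AⁱBᵢ
    print(np.dot(A_cov, B))                 # AᵢBⁱ, the same number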

A⁰ is the time component of Aⁱ, and A¹,A²,A³ are the space components.

The square of a 4-v can be positive, negative or zero, something that would never happen in Euclidean geometry. If it is positive, the vector is timelike. If negative, spacelike. If zero, it is a null-vector or an isotropic vector.

In Euclidean geometry we write vectors as |A> = (A¹,A²,A³), and clearly the space components form a Euclidean 3d vector.

We can write Aⁱ as Aⁱ = (A⁰,|A>). Also, Aᵢ = (A⁰,-|A>).

All this fuss about contravariant and covariant indices is a consequence of the weird signs of this geometry. In Euclidean geometry we don't bother about such distinctions, so it does not matter if you write indices as subscripts or superscripts. In this book, when we write equations for only-space coordinates, we use Greek indices and usually write only subscripts. We apply Einstein's rule for summing over dummy indices, which now can both be subscripts!

|A>·|B> = AᵦBᵦ.

Four-tensors.

A scalar does not need an index, like A. We say it is a tensor of rank 0.

A vector needs a single index, like Aⁱ. We say it is a tensor of rank 1.

An object like Aⁱʲ is called a tensor of rank 2.

Clearly, the rank of an object is the number of indices it has. We can have tensors of any rank.

In 4d, Aⁱʲ is a collection of 16 quantities. These quantities transform as the products of components of two four-vectors. This does not mean that every tensor actually is such a product, but as far as transformations are concerned we can treat it as if we could write

Aⁱʲ = BⁱCʲ.

We can write tensors of second rank in several ways. The covariant form is Aᵢⱼ. The contravariant form is Aⁱʲ. And there are two mixed forms: Aᵢʲ and Aⁱⱼ. In mixed forms, what is important is not only which index goes at the top and which at the bottom, but also which index comes first: in general Aᵢʲ (first index at the bottom) and Aʲᵢ (first index at the top) are different quantities, and they coincide only for symmetric tensors. With a pencil, you could write the two indices of a mixed form exactly on top of each other, but better if you don't: the horizontal order carries information.

A rule for signs: if you raise or lower a spatial index, the tensor component changes sign, but if you raise or lower a time component, the sign does not change. If you perform several operations, multiply the signs:

A₀₀=A⁰⁰, A₀₁=-A⁰¹, A₁₁=A¹¹
A₀⁰=A⁰⁰, A₀¹=A⁰¹, A₁⁰=-A¹⁰, A₁¹=-A¹¹ .
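The same sign rules can be checked in code (a NumPy sketch of my own; A_up plays the role of Aⁱᵏ with random entries):

    import numpy as np

    rng = np.random.default_rng(0)
    A_up = rng.integers(-5, 6, (4, 4)).astype(float)  # Aⁱᵏ, contravariant

    s = np.array([1.0, -1.0, -1.0, -1.0])
    A_low = np.outer(s, s) * A_up       # Aᵢₖ: multiply the signs of both indices
    A_mix = s[:, None] * A_up           # Aᵢᵏ: only the first index is lowered

    print(A_low[0, 0] ==  A_up[0, 0])   # A₀₀ =  A⁰⁰
    print(A_low[0, 1] == -A_up[0, 1])   # A₀₁ = -A⁰¹
    print(A_low[1, 1] ==  A_up[1, 1])   # A₁₁ =  A¹¹
    print(A_mix[1, 1] == -A_up[1, 1])   # A₁¹ = -A¹¹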

The component A⁰⁰ is a three-dimensional scalar.

The components A⁰¹, A⁰², A⁰³ and also A¹⁰, A²⁰, A³⁰ are three-dimensional vectors.

The nine quantities A¹¹, A¹², etc, under purely spatial transformations, form a three-tensor.

A tensor is symmetric if Aⁱʲ=Aʲⁱ, and antisymmetric if Aⁱʲ=-Aʲⁱ.

If antisymmetric (also called skew-symmetric), the diagonal components must be 0, since for i=k the antisymmetry condition gives Aⁱⁱ = -Aⁱⁱ (no sum). It is like having x = -x, which is only true for x=0.

In a symmetric tensor, the mixed quantities Aᵢʲ and Aʲᵢ are equal. It is then that you can write the two indices vertically if you use pencil and paper!

We can have tensor equations! In them, both sides must have identical and identically placed *free indices*. A free index is an index which is not dummy.

If you have a tensor equation, you can choose an index, like i, and flip its position (from top to bottom or from bottom to top). But then, you must flip all instances of the same index i in the whole equation.

If you have a tensor expression (an expression has only one side, while an equation has two sides), you can raise and lower free indices, but then the nature of the expression will change. For example, Aⁱⁱ are tensor components, but if you lower one i, as in Aⁱᵢ, we have a scalar now! Aⁱᵢ = A⁰₀ + A¹₁ + A²₂ + A³₃. This is called the *trace* of the tensor, and the operation to obtain it is called *contraction*.

Another contraction: take a tensor AⁱBⱼ (notice it is a tensor formed from two vectors!) and then write AⁱBᵢ, which is its contraction.

Contracting two indices lowers the rank of the tensor by 2.

Aⁱⱼₖₕ is a tensor of rank 4, since there are four free indices: ijkh. We can contract 2 of them now.

Aⁱⱼₖᵢ is a tensor of rank 2, since it has only two free indices, j and k.

AⁱₖBʲ is a tensor of rank 3. We can contract it as AⁱₖBᵏ, which has rank 1 (= a four-vector).

Aⁱʲₖₕ is a tensor of rank 4. It can be contracted to a scalar if we do Aⁱʲᵢⱼ.

An important special tensor is the unit four-tensor δᵢᵏ. Since it is symmetric, with paper and pencil you would write both indices on top of each other.

This tensor usually has the task of changing indices of another tensor. It just changes the letter, but does not raise or lower indices.

δᵢᵏAⁱ = Aᵏ ---> this replaces the index i with k in A.

The components of δ are equal to 1 if i=k and 0 if i≠k.

Its trace is built by contraction: δᵢⁱ = 4.

Now we can take δ, which always has one index down and the other up, and raise or lower one of them. For example, if we have δᵢᵏ we can raise the i and get gⁱᵏ or lower the k and get gᵢₖ. Notice that when both indices are either up or down we don't call them δ but g. We call g the *metric tensor*.

We can write the metric tensor as a matrix:

              ( 1   0   0   0 )
              ( 0  -1   0   0 )
(gᵢₖ)=(gⁱᵏ) = ( 0   0  -1   0 )
              ( 0   0   0  -1 )

If written as a matrix, the first index labels the rows and the second the columns.

The task of δ was to change the letter of an index without raising or lowering it.

The task of g is to change the letter of an index with also raising or lowering it.

You can think of g as pulling an index to the position where g has its own indices.

gᵢₖAᵏ = Aᵢ ----> index lowering. Here, g pulls the index down.

gⁱᵏAₖ = Aⁱ ----> index raising. Here, g pulls the index up.

The scalar product of two 4-v can be written in the (extremely important) forms:

AⁱAᵢ = gᵢₖAⁱAᵏ = gⁱᵏAᵢAₖ.
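In code, raising and lowering with g, and the three forms of the scalar square (a minimal NumPy sketch of my own, not from the book):

    import numpy as np

    g = np.diag([1.0, -1.0, -1.0, -1.0])    # gᵢₖ (here gⁱᵏ has the same entries)

    A_up = np.array([2.0, -1.0, 0.5, 3.0])  # Aᵏ
    A_low = g @ A_up                        # gᵢₖAᵏ = Aᵢ (index lowering)
    print(g @ A_low)                        # gⁱᵏAₖ = Aⁱ (raising brings it back)

    print(np.dot(A_up, A_low))                      # AⁱAᵢ
    print(np.einsum('ik,i,k->', g, A_up, A_up))     # gᵢₖAⁱAᵏ
    print(np.einsum('ik,i,k->', g, A_low, A_low))   # gⁱᵏAᵢAₖ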

The tensors δᵢᵏ, gᵢₖ, gⁱᵏ are special: their components are the same in all coordinate systems. This does not happen for other tensors, whose components are different in different coordinate systems.

For a tensor, don't think of it in terms of its components, since the components change with the reference system. It's like the components of a vector: they change as we change the reference system. However, the vector itself, as an arrow on a sheet of paper, will stay the same! It is the same with tensors.

However, for δ and g, since the components are always the same, you can think of them by their components if you like.

There is another very special tensor with this same property of components being the same for all reference systems. It is called the *completely antisymmetric unit tensor* εʰⁱʲᵏ. It has rank 4.

This tensor changes sign when you take any two indices and exchange their positions. Its components are either 0, +1 or -1.

If two indices are the same, the component is 0.

The only non-zero components need to have four different indices.

We set ε⁰¹²³ = 1. And this implies ε₀₁₂₃ = -1, since to lower the four indices we need to use four g components, three of them being -1 and one of them +1.

To know whether other components are +1 or -1, try to see whether the index arrangement can be obtained from 0123 by an even or odd number of transpositions. If even, we get +1. If odd, -1. For example, ε⁰¹³² = -1.

The number of permutations of 4 different letters is 4! = 24, and each of the 24 non-zero terms of εʰⁱʲᵏεₕᵢⱼₖ contributes a -1. This means that we can fully contract it as

εʰⁱʲᵏεₕᵢⱼₖ = -24.
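Both claims can be verified numerically. Here is a NumPy sketch (my own construction; the helper parity() computes the sign of a permutation):

    import numpy as np
    from itertools import permutations

    def parity(p):                  # +1 for an even permutation, -1 for odd
        p = list(p)
        sign = 1
        for i in range(len(p)):
            while p[i] != i:
                j = p[i]
                p[i], p[j] = p[j], p[i]
                sign = -sign
        return sign

    eps_up = np.zeros((4, 4, 4, 4))         # εʰⁱʲᵏ with ε⁰¹²³ = +1
    for p in permutations(range(4)):
        eps_up[p] = parity(p)

    s = np.array([1.0, -1.0, -1.0, -1.0])
    eps_low = np.einsum('hijk,h,i,j,k->hijk', eps_up, s, s, s, s)  # lower all four

    print(eps_low[0, 1, 2, 3])                        # -> -1
    print(np.einsum('hijk,hijk->', eps_up, eps_low))  # -> -24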

Let's try to understand now that ε is not strictly a tensor but a *pseudotensor*.

We can have true scalars and pseudoscalars. We can have true vectors and pseudovectors. And the same for tensors.

Components of ε are defined not to change with changes of reference. But if we change the sign of 1 or 3 of the coordinates, for example doing x --> -x or doing (x,y,t) ---> (-x,-y,-t), some of the components would have to change sign if ε transformed as a true tensor. Since by definition they do not, ε is not strictly a tensor.

A pseudotensor behaves as a tensor for all coordinate transformations except those that cannot be reduced to rotations. For example, it does not behave as a tensor under a reflection, which cannot be reduced to a rotation.

The tensor of rank 8 εʰⁱʲᵏεᵃᵇᶜᵈ is a true tensor, though, since any change of sign produced by a reflection would produce (-1)(-1)=1.

This 8-rank tensor can be contracted to rank 6, 4 and 2. All these tensors have the same form in all coordinate systems. So they can all be expressed by combinations of δ tensors, since δ is the only true tensor whose components are invariant under all coordinate transformations, even reflections.

If Aⁱᵏ is antisymmetric, we can create its *dual*, which will be the pseudotensor A*ⁱᵏ = (1/2)·εⁱᵏᵃʰAₐₕ. Notice how this contraction has rank 2 as well. Both tensors are dual to one another. Rank 2 dual to rank 2.

Given a 1-rank tensor Aⁱ we can also find its dual as the pseudotensor εⁱᵏᵃʰAₕ. Rank 3 dual to rank 1.

The contraction AⁱᵏA*ᵢₖ of dual tensors is a pseudoscalar.

There are analogous properties between 3d vectors and tensors.

The completely antisymmetric unit pseudotensor of rank 3 is εᵅᵝᵞ. (In 3d we don't need to distinguish between subscripts and superscripts and we use Greek indices. If I use superscripts, against the book's choice of subscripts, it is because there is more availability of Greek superscripts in Unicode.) The sign of this ε changes when we transpose any pair of indices. The only non-zero components are those in which the three letters are different. We set ε¹²³ = 1 and then we take +1 for an even number of transpositions and -1 for an odd number of transpositions (wrt the initial configuration 123).

Any completely antisymmetric tensor of rank equal to the number of dimensions of the space in which it is defined will be invariant under rotations of the frame in this space.

Again, our 3d ε is a pseudotensor: under reflections its components would have to change sign if it were a true tensor, but they are defined not to change. We can, however, build a true tensor (of rank 6) as εᵅᵝᵞεᵟᶿᵠ. Therefore, it can be expressed as combinations of components of the unit three-tensor δᵅᵝ.

An ordinary true vector also changes sign in its components if we do a reflection of the coordinates, which means changing x by -x, y by -y and z by -z. Such a true vector is also called *polar*.

A vector that is the result of the cross product of other two vectors is called *axial*, and it does not change sign upon reflection. Both of its factors, usually polar, will change sign, but the product will not get this change of sign. An axial vector is a pseudovector. For a vector, pseudo means not changing sign under coordinate inversion.

A scalar product produces a scalar from two vectors. If we do a scalar product between an axial and a polar vector, the result is a pseudoscalar, since it changes sign under a coordinate inversion. For a scalar, pseudo means changing sign under coordinate inversion.

Recall how a pseudovector is dual to an antisymmetric tensor. So, if our pseudovector is |C> = |A> ⨯ |B>, then the vector |C> can also be written as a contraction of its dual tensor, Cᵅ = (1/2)·εᵅᵝᵞCᵝᵞ, where Cᵝᵞ = AᵝBᵞ-AᵞBᵝ.

This is extremely important in physics. For example, in electromagnetism (EM) we will find that the magnetic field can be written either as a vector or as an antisymmetric matrix. In spatial rotations, we can also write the angular velocity as a vector or as an antisymmetric matrix. The same happens for all axial vectors, such as angular momentum, torque (moment of force), etc.
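Here is a small numerical illustration of the duality (Python/NumPy, my choice): we build the 3d ε, form Cᵝᵞ = AᵝBᵞ - AᵞBᵝ, contract, and compare with the ordinary cross product.

    import numpy as np
    from itertools import permutations

    def parity(p):                  # +1 for an even permutation, -1 for odd
        p = list(p)
        sign = 1
        for i in range(len(p)):
            while p[i] != i:
                j = p[i]
                p[i], p[j] = p[j], p[i]
                sign = -sign
        return sign

    eps3 = np.zeros((3, 3, 3))      # ε¹²³ = +1 (indices run 0,1,2 in code)
    for p in permutations(range(3)):
        eps3[p] = parity(p)

    A = np.array([1.0, 2.0, 3.0])
    B = np.array([-1.0, 0.5, 2.0])

    C_tensor = np.outer(A, B) - np.outer(B, A)       # Cᵝᵞ = AᵝBᵞ - AᵞBᵝ
    C_dual = 0.5 * np.einsum('abc,bc->a', eps3, C_tensor)

    print(C_dual)             # equals the cross product...
    print(np.cross(A, B))     # ...computed by NumPy directly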

Let's move on to consider four-tensors Aⁱᵏ, which are of rank 2. If it is antisymmetric, then its space components (1,2,3) form a 3d antisymmetric tensor corresponding to purely spatial transformations. This spatial tensor could also be written as an axial vector |a> = (ax,ay,az).

The components that involve the time component, A⁰¹,A⁰²,A⁰³, form a 3d polar vector |p> = (px,py,pz). In matrix terms, we can show how these 6 components are located in a 4⨯4 arrangement:

    (    0       px      py      pz   )
    (                                 )
    (   -px      0      -az      ay   )
    (                                 )
    (   -py      az      0      -ax   )
    (                                 )
    (   -pz     -ay      ax      0    )

It is worthwhile here to identify the interesting structures: a 3⨯3 matrix with the axial vector and two instances of the polar vector, one as a row and another as a column, with opposite signs:

              +--------------------+ 
    (   0     | px      py      pz |  )
    (  +----+ +--------------------+  )
    (  |-px | | 0       -az     ay |  )
    (  |    | |                    |  )
    (  |-py | | az      0       -ax|  )
    (  |    | |                    |  )
    (  |-pz | | -ay     ax      0  |  )
       +----+ +--------------------+ 

So, when having an antisymmetric four-tensor (4-T), we could write Aⁱᵏ = ( |p>, |a> ) for the contravariant version. For the covariant version, Aᵢₖ = ( -|p>, |a> ). Notice how by going from contravariant to covariant we show whether a vector is true or pseudo. Since |p> is true, it changes sign, while |a>, being pseudo, remains the same.
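In code, the (|p>, |a>) packing and the sign behaviour under lowering look like this (a NumPy sketch with made-up numbers):

    import numpy as np

    p = np.array([1.0, 2.0, 3.0])    # polar vector
    a = np.array([4.0, 5.0, 6.0])    # axial vector

    A_up = np.array([
        [  0.0,   p[0],  p[1],  p[2]],
        [-p[0],   0.0,  -a[2],  a[1]],
        [-p[1],  a[2],   0.0,  -a[0]],
        [-p[2], -a[1],  a[0],   0.0]])

    s = np.array([1.0, -1.0, -1.0, -1.0])
    A_low = np.outer(s, s) * A_up    # covariant version Aᵢₖ

    print(A_low[0, 1:])              # -p: the polar part flips sign
    print(A_low[2, 3], A_up[2, 3])   # both -a_x: the axial part does not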

Vector calculus in 4d.

You may have noticed that with Unicode we cannot put little arrows to letters so that they can be understood as vectors. We cannot use bold fonts either. So for a vector I am going to use a bra-ket or Dirac notation. This means that a regular vector will be written as |a> = (ax,ay,az), where |a> is the ket form.

What about bra's? Imagine we want to build the square of the magnitude of the vector |a>, so we need the scalar product |a>·|a> = ax² + ay² + az². In this case, I prefer to write <a|·|a> = <a|a> = ax² + ay² + az². It is clear that for 3d we can write either |a> or <a| according to our wishes; there is no difference between them.

However, it is quite convenient to have a notation that can distinguish between |a> and <a|, since, when moving to more complex and difficult topics, |a> will be a vector and <a| will be a *form*. Or, in our present subject, some objects will be contravariant and others will be covariant. And we have seen how, in order to form a contraction, like a scalar product, we need both objects to be of opposite nature, otherwise their indices cannot be considered as dummy.

So AⁱAᵢ = AᵢAⁱ = <A|A>, which in the 3d Euclidean case is just Ax² + Ay² + Az² = A².

In 3d, the gradient of a scalar ϕ uses the vector symbol of a nabla ∇. Should we write |∇> or <∇| ? This is an interesting question! If, instead of the gradient of a scalar we want to perform the divergence of a vector field |A>, it is quite clear that we are going to write <∇|A>, where <∇| acts as a form. With components, we have

<∇|A> = ∂Ax/∂x + ∂Ay/∂y + ∂Az/∂z = ∂₁A₁ + ∂₂A₂ + ∂₃A₃.

What about the gradient? We need to apply the nabla operator to a scalar and obtain a vector field. But is this vector field a true vector or is it a form? It happens to be a form, so we write it as <∇|ϕ. If you are dealing only with 3d, then we insist: there is no need to distinguish between bra's (forms) and ket's (vectors). You just place an arrow to every vector and that's that.

You may wonder why the gradient field is a form instead of a vector. The reason comes from the concept of total derivative. If we have a scalar field f and we perform the total derivative df, we must choose the direction along which we want to perform it. Imagine the scalar field in 2d and take a point at which we know the value of the field. This point has a neighbourhood, a ring of neighbour points to choose from. A total derivative of the field f is how f changes from our central initial point to a final neighbour point. But depending on the direction (the neighbour) that we choose, the value of the total derivative is, in general, different.

The concept of gradient tells us along which direction the change of the field has the maximum value (in absolute terms). So <∇|f gives a direction for which the change of f wrt its surroundings is maximum. This maximum change can be thought of as a vector, and then, in order to know the total change of the scalar field along a given direction, we simply project the previous vector, the gradient, along the desired direction.

The gradient is <∇|f, and the direction that we choose is |dr>, where |r> is the vector position that we are considering. Then, the total derivative is df. We insist: asking for df is a meaningless thing if we don't specify the direction |dr> for which we want to evaluate the change. Then, once |dr> is given, the total derivative *along* the direction |dr> is

df = <∇|f|dr>.

Conclusion for what matters here: the gradient of a scalar field is a form, or a bra.

There is also a relation to covariant and contravariant here. We can associate contravariant-vector-ket and covariant-form-bra.

The four gradient of ϕ is

∂ϕ/∂xⁱ = (∂ϕ/(c∂t), <∇|ϕ).

The total derivative of ϕ along dxⁱ is dϕ = (∂ϕ/∂xⁱ)·dxⁱ (a form followed by a vector).

The operators of differentiation are considered as covariant.

The divergence of a four-vector can be written as ∂Aⁱ/∂xⁱ, which is a true scalar.

But we could do the opposite: to use the contravariant differentiation operators against the covariant components. Then, it is the differentiation operator which becomes the vector/ket, while the components in which it operates are those of a form/bra:

∂ϕ/∂xᵢ = (∂ϕ/(c∂t), -|∇>ϕ).

When we want the square of the gradient, which in 4d is the d'Alembertian □ (the four-dimensional analogue of the Laplacian ∇²), we need to contract two differentiation operators, which means that one must be a bra (covariant) and the other a ket (contravariant). So we write

□ϕ = (∂/∂xᵢ)(∂ϕ/∂xⁱ) = (1/c²)∂²ϕ/∂t² - ∇²ϕ.

Notice how for derivatives, covariant goes as superscript (in the denominator) while contravariant goes as subscript.

There is a more compact form of writing these differentiation operators that can be less confusing:

∂/∂xⁱ = ∂ᵢ     ∂/∂xᵢ = ∂ⁱ

which I think is better, since then we always have contravariant with superscripts and covariant with subscripts.

In this notation, the gradient of ϕ is just ∂ᵢϕ. The divergence of Aⁱ is ∂ᵢAⁱ. And the d'Alembertian of f is ∂ᵢ∂ⁱf.
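As a worked check of ∂ᵢ∂ⁱ, here is a symbolic sketch with SymPy (my own choice of tool, not the book's): the operator applied to a plane wave travelling at speed c gives zero, which is just the wave equation.

    import sympy as sp

    c, t, x, y, z = sp.symbols('c t x y z', positive=True)
    phi = sp.sin(c*t - x)            # a plane wave moving along x at speed c

    # ∂ᵢ∂ⁱϕ = (1/c²)∂²ϕ/∂t² - ∂²ϕ/∂x² - ∂²ϕ/∂y² - ∂²ϕ/∂z²
    box_phi = (sp.diff(phi, t, 2) / c**2
               - sp.diff(phi, x, 2) - sp.diff(phi, y, 2) - sp.diff(phi, z, 2))
    print(sp.simplify(box_phi))      # -> 0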

There is yet another notation, very common in many books, using commas. As far as I know there are no subscript and superscript commas in Unicode, so I will use just , and ':

∂ϕ/∂xⁱ = ϕ,ᵢ       ∂ϕ/∂xᵢ = ϕ'ⁱ  .

Is it necessary to choose a notation? Not at all. It is better to be able to understand all of them, in order to communicate with others, to understand all the articles and books in the literature, and to use the best of each choice in each situation. I particularly like ∂ᵢ and ∂ⁱ because it is compact, has the indices where they should be, and allows using naked operators without explicitly calling for the objects upon which they operate.

In 3d, we can integrate over a path (or curve), over a surface and over a volume. In 4d there are four types of integral:

1) Integral over a path in four-space, like a world line. The element of integration is the vector giving us the desired direction at each point: dxⁱ.

2) Integral over a 2d surface. We know that in 3d we can find the area formed by the parallelogram made by |dr> and |dr'> as |dr> ⨯ |dr'>, and this gives a vector whose components are the projections of that area along the three directions of space. These projections are the three components of the cross product, so the projection for the plane xᵅxᵝ is

dxᵅdx'ᵝ - dxᵝdx'ᵅ .

So in 3d this is the element of integration. But in 4d it is not so easy to visualise. However, we follow the analogy. The integration element of surface is given by the antisymmetric tensor dfⁱᵏ = dxⁱdx'ᵏ - dxᵏdx'ⁱ. In this case, this element is again the projection of the area element on the plane ik.

In 3d, a surface element is characterised by its normal vector: a unit vector perpendicular to the infinitesimal area. Its sign is given by the order in which the vectors of the parallelogram are given. We multiply this normal vector by the magnitude of the area and we finally obtain our surface element.

Remember that this surface element pseudovector dfᵅ is dual to the tensor dfᵅᵝ, so that dfᵅ = (1/2)εᵅᵝᵞdfᵝᵞ.

In 4d we cannot build such a vector, but we can take the tensor dfⁱᵏ, which does exist, and take its dual

df*ⁱᵏ = (1/2) εⁱᵏˡᵐdfₗₘ .

This tensor describes an element of surface which is equal in magnitude and "perpendicular" to the surface dfⁱᵏ. This perpendicularity is reflected by dfⁱᵏdf*ᵢₖ = 0.

3) Integral over a hypersurface (a 3d manifold). In 3d, the volume given by the parallelepiped formed by three vectors is the determinant of these three vectors. So for infinitesimal vectors dxⁱ, dx'ⁱ and dx"ⁱ, the hypersurface element is the rank 3 tensor

           | dxⁱ   dx'ⁱ   dx"ⁱ |
           |                   |
  dSⁱᵏˡ =  | dxᵏ   dx'ᵏ   dx"ᵏ | .
           |                   |
           | dxˡ   dx'ˡ   dx"ˡ |

This 3-rank tensor has as dual a 4-vector dSⁱ = -(1/6)εⁱᵏˡᵐdSₖₗₘ. You may wonder about these prefactors that appear in duals. In this case it is 1/6 because 3!=6, and to form the dual we contract three dummy indices, so we have 6 non-zero terms. If we invert the expression we get dSₖₗₘ = εₙₖₗₘdSⁿ. The prefactor is a 1 because we only sum over one dummy index.

The geometrical interpretation of dSⁱ is that of a 4-v being equal in magnitude to the hyper-area of the surface element, and perpendicular to that element. In particular, and to further illustrate the nature of duality:

dS⁰ = dS¹²³
dS¹ = dS⁰²³
dS² = dS⁰¹³
dS³ = dS⁰¹².

See how the dual has the components that the other does not have. This also makes clear how the dual of a rank 2 is another rank 2 (since we are in 4 dimensions).

Notice how dS⁰ is just dx·dy·dz, the three dimensional volume of xyz space, usually written as dV = dx·dy·dz. This dV is the projection of the hypersurface element on the hyperplane x⁰=const (in other words, at a given time).
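The relation dS⁰ = dS¹²³, including the -1/6 prefactor, can be verified numerically. A NumPy sketch (my own) with three random displacement 4-vectors:

    import numpy as np
    from itertools import permutations

    def parity(p):                  # +1 for an even permutation, -1 for odd
        p = list(p)
        sign = 1
        for i in range(len(p)):
            while p[i] != i:
                j = p[i]
                p[i], p[j] = p[j], p[i]
                sign = -sign
        return sign

    eps = np.zeros((4, 4, 4, 4))              # εⁱᵏˡᵐ with ε⁰¹²³ = +1
    for p in permutations(range(4)):
        eps[p] = parity(p)

    rng = np.random.default_rng(1)
    dx, dxp, dxpp = rng.random(4), rng.random(4), rng.random(4)

    dS3 = np.zeros((4, 4, 4))                 # dSⁱᵏˡ as a 3x3 determinant
    for i in range(4):
        for k in range(4):
            for l in range(4):
                dS3[i, k, l] = np.linalg.det(np.array(
                    [[dx[i], dxp[i], dxpp[i]],
                     [dx[k], dxp[k], dxpp[k]],
                     [dx[l], dxp[l], dxpp[l]]]))

    s = np.array([1.0, -1.0, -1.0, -1.0])
    dS3_low = np.einsum('klm,k,l,m->klm', dS3, s, s, s)      # lower the indices
    dS = -(1.0/6.0) * np.einsum('iklm,klm->i', eps, dS3_low)

    print(dS[0], dS3[1, 2, 3])                # equal: dS⁰ = dS¹²³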

4) Integral over a 4d volume. The element of the integration is dΩ = dx⁰dx¹dx²dx³ = cdt·dV.

This element is a scalar, and it is invariant under rotations of the four-dimensional coordinate system.

The Jacobian.

When we transform, in an integral, from variables x⁰, x¹, x², x³ to new variables x'⁰, x'¹, x'², x'³, the element of integration does NOT simply change from dΩ = dx⁰dx¹dx²dx³ to dΩ' = dx'⁰dx'¹dx'²dx'³ as one would naively believe. Instead, the two elements are related through the *Jacobian* J of the transformation: dΩ' = J·dΩ. The Jacobian is written as

J = ∂(new variables) / ∂(old variables).

Suppose that there are functions f⁰, f¹, f² and f³ such that x'ⁱ = fⁱ(x⁰,x¹,x²,x³). Then,

J = ∂(f⁰,f¹,f²,f³) / ∂(x⁰,x¹,x²,x³).

This notation is new. It can be more explicitly written (I like this one more) as

    | ∂₀f⁰   ∂₁f⁰   ∂₂f⁰   ∂₃f⁰ |   
    |                           |
    | ∂₀f¹   ∂₁f¹   ∂₂f¹   ∂₃f¹ |
J = |                           | .
    | ∂₀f²   ∂₁f²   ∂₂f²   ∂₃f² |
    |                           |
    | ∂₀f³   ∂₁f³   ∂₂f³   ∂₃f³ |

You can interpret the above as a matrix (the Jacobian matrix) or as a determinant (the Jacobian determinant or just "the Jacobian"). For the integration element we use the determinant, of course.

For rotations, J = 1, which makes dΩ' = dΩ, so the hypervolume remains constant.
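A one-line check that J = 1 for a Lorentz "rotation": since a boost is linear, its Jacobian matrix is the boost matrix itself, everywhere. In NumPy (my choice):

    import numpy as np

    b = 0.6                                  # u/c
    root = np.sqrt(1 - b**2)
    L = np.array([[1/root, b/root, 0, 0],    # x-boost; linear, so this IS
                  [b/root, 1/root, 0, 0],    # the Jacobian matrix everywhere
                  [0,      0,      1, 0],
                  [0,      0,      0, 1]])

    print(np.linalg.det(L))                  # -> 1.0, hence dΩ' = dΩ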

The Jacobian matrix and its transpose are fundamental objects to learn. I will try to convince you of this.

Suppose you have a scalar field f and you want to take its total derivative (along a given direction). Then, what you do is to build an object that does not depend on the chosen direction (we can call it a "universal object"). This universal object, in this case, is the gradient <∇|f. Then, to get the total derivative along a chosen direction |dr>, we just do <∇|f|dr>, which gives the projection of the universal object <∇|f along the particular object |dr>. We have discussed this above, so we already knew it.

Now suppose we have a vector field |g> instead of the scalar field f, and we still want to calculate the total derivative of this vector field along a given direction. Now it is not so easy! However, the philosophy of building a universal object and then projecting along the particular direction still holds.

The first question is, obviously, what is that magical universal object for a vector field? For a scalar field it was the gradient of that field. For a vector field |g>, the universal object is the Jacobian matrix J of the field with respect to the coordinates, J = ∂(|g>) / ∂(|r>) or just J(|g>,|r>), which for 3d coordinates gives

             ( ∂₁g¹   ∂₂g¹   ∂₃g¹ )  
             (                    )
J(|g>,|r>) = ( ∂₁g²   ∂₂g²   ∂₃g² ) = J(|g>),
             (                    ) 
             ( ∂₁g³   ∂₂g³   ∂₃g³ )

where the |r> dependence seems quite unnecessary here.

As you see, I am using now J for the matrix, and for the determinant we can use |J|.

In essence, then,

d|g> = J(|g>)·|dr>

or more explicitly,

       ( ∂₁g¹   ∂₂g¹   ∂₃g¹ )   ( dx¹)
       (                    )   (    )
d|g> = ( ∂₁g²   ∂₂g²   ∂₃g² ) · ( dx²) .
       (                    )   (    )
       ( ∂₁g³   ∂₂g³   ∂₃g³ )   ( dx³)

Why is the transpose Jᵀ so important as well? Notice what happens when we multiply Jᵀ by the same direction vector as above:

( ∂₁g¹   ∂₁g²   ∂₁g³ )   ( dx¹)
(                    )   (    )
( ∂₂g¹   ∂₂g²   ∂₂g³ ) · ( dx²)  = <∇| <g|dr> >.
(                    )   (    )
( ∂₃g¹   ∂₃g²   ∂₃g³ )   ( dx³)

In other words, Jᵀ·|dr> gives the gradient of the scalar field formed by |g> and |dr>.
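A numerical sketch of both claims, using a toy field |g> = (xy, yz, zx) whose Jacobian we can write by hand (Python/NumPy, my own example):

    import numpy as np

    def g(r):                            # a toy vector field |g>
        x, y, z = r
        return np.array([x*y, y*z, z*x])

    def J(r):                            # its Jacobian matrix, ∂gⁱ/∂xₖ
        x, y, z = r
        return np.array([[y, x, 0.0],
                         [0.0, z, y],
                         [z, 0.0, x]])

    r0 = np.array([1.0, 2.0, 3.0])
    dr = 1e-6 * np.array([0.3, -0.5, 0.8])

    print(g(r0 + dr) - g(r0))            # total derivative d|g> ...
    print(J(r0) @ dr)                    # ... matches J|dr>

    def f(r):                            # the scalar field <g|dr>, dr held fixed
        return np.dot(g(r), dr)

    h = 1e-6
    grad_f = np.array([(f(r0 + h*e) - f(r0 - h*e)) / (2*h) for e in np.eye(3)])
    print(grad_f)                        # gradient of <g|dr> ...
    print(J(r0).T @ dr)                  # ... matches Jᵀ|dr>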

In classical mechanics, we know how Euler-Lagrange equations relate a total derivative with the gradient of a field. For example, Newton's 2nd law for a potential U,

d|p> = <∇|(-U·dt)

where we notice how in classical mechanics, bras and kets should not be distinguished, since they are happily mixed. But see how a total derivative on the left is equalled to a gradient of a scalar on the right. This means we can write

 J|dr> = Jᵀ|dr>

where we write J for J(|p>) and where we clearly need that -U·dt = <p|dr>. We know that U = -<p|v> is correct for classical mechanics, where |dr>/dt = |v>. So Newton's 2nd law could be written as

 ( J(|p>) - Jᵀ(|p>) )|dr> = |0>.

This shape is particularly interesting, since the matrix J-Jᵀ is antisymmetric by construction, and has zero diagonal terms!

Gauss' theorem.

In 3d vector calculus, when we have a closed surface S that encloses a volume V, and there is a vector field |A>, we can compute the flux of the field across the surface as

flux of |A> across closed surface S = ∮<A|dS>

where <A| is the bra form of |A> and where |dS> is a vector assigned to each infinitesimal piece of surface dS, with magnitude equal to dS and direction perpendicular to the little surface. The symbol ∮ means "integrate over a closed object". In this case, the surface is closed, hence the ∮. The concept of flux can be applied to a non-closed surface, but then Gauss' theorem does not apply.

Gauss' theorem says that the flux of |A> across S equals the volume integral of the divergence of the same field, so

∮<A|dS> = ∫<∇|A>dV   or  ∮<dS|A> = ∫<∇|A>dV. (Gauss' theorem 3d.)

This theorem is of fundamental importance in many areas, but in EM it is of paramount relevance.
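As an illustration, here is a midpoint-rule check of the 3d theorem on the unit cube, with the toy field |A> = (xy, yz, zx), so that <∇|A> = y + z + x (my own example, not from the book):

    import numpy as np

    n = 400
    u = (np.arange(n) + 0.5) / n       # midpoint grid on [0, 1]
    P, Q = np.meshgrid(u, u)           # coordinates over one face of the cube
    dS = 1.0 / n**2

    # The field vanishes on the faces x=0, y=0, z=0. On the other three,
    # <A|n> is y (face x=1), z (face y=1) and x (face z=1); each integrates
    # the same way over a unit square:
    flux = 3 * np.sum(P) * dS
    print(flux)                        # -> 1.5

    # Volume side: the integral of x + y + z over the unit cube:
    print(3 * np.sum(u) / n)           # -> 1.5 as well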

We can generalise this theorem to 4d. We deal with an integral over a closed hypersurface, which is our "flux" or our "hyperflux". Then, the theorem can be written as

∮AⁱdSᵢ = ∫∂ᵢAⁱdΩ .    (Gauss' theorem in 4d.)

In essence, Gauss' theorem is about changing from dSᵢ to dΩ∂ᵢ, or in 3d vector calculus, from <dS| to dV<∇|.

Stokes' theorem.

The name "Stokes' theorem" can be confusing, since sometimes we use this name for a generalised version that includes the content of several vector calculus theorems. However, here we use this name to refer to what is also know as the curl theorem.

In 3d, given a vector field |A>, we can consider, instead of closed surfaces, a closed path Γ which encloses an open surface S. Just as for the closed surface we calculated the flux of |A> across S, we now calculate the *circulation* of |A> along Γ. This is done via the integral ∮<A|dl> or ∮<dl|A>. In these theorems it is more convenient to think of the field as the true vector and the other "vector" as the form.

The vector |dl> or better <dl| has a magnitude equal to dl, which gives the infinitesimal increment along the path, and has a direction tangent to such path.

Stokes' theorem says that the circulation of |A> along a path Γ is

∮<dl|A> = ∫<dS|·(|∇>⨯|A>) , 

where the second integral is performed over the surface S that the path Γ encloses.
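A concrete check of the 3d statement (my own example): take |A> = (-y, x, 0), whose curl is (0, 0, 2), and the unit circle in the xy-plane bounding the unit disk.

    import numpy as np

    n = 100000
    th = np.linspace(0.0, 2*np.pi, n, endpoint=False)
    dth = 2*np.pi / n

    # On the circle, |dl> = (-sin th, cos th, 0)·dth and
    # |A> = (-sin th, cos th, 0), so <dl|A> = dth exactly.
    x, y = np.cos(th), np.sin(th)
    circ = np.sum((-y) * (-np.sin(th)) + x * np.cos(th)) * dth
    print(circ)               # -> 2*pi

    # Surface side: curl |A> = (0, 0, 2) over a disk of area pi:
    print(2 * np.pi)          # -> 2*pi as well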

Stokes' theorem is about changing from <dl|· to <dS|·(|∇>⨯ , where we place a single opening parenthesis to remind us that the operator |∇> is to be applied to the vector field first, and only after that do we perform the scalar product with <dS|. See how <dS| is covariant while |∇> is contravariant. The first operation is |∇>⨯|A>, whose result is a contravariant vector that then associates with <dS| to form a scalar. Since |∇>⨯|A> gives a ket vector, we could also write it as |∇⨯A>.

For Gauss' theorem we substituted the integral element dSᵢ by the operator dΩ∂ᵢ. Notice how we go from element to operator, and how we go from surface element to volume element.

By analogy, the line element dxᵢ can be substituted by an operator in which we also use the element of surface dfₖᵢ. We do

dSᵢ ---> dΩ∂ᵢ   
dxᵢ ---> dfₖᵢ∂ᵏ.

This means that we can write

∮Aⁱdxᵢ = ∫ dfₖᵢ∂ᵏAⁱ.

This generalises Stokes' theorem. Notice the expression ∂ᵏAⁱ, with both objects being contravariant. It is the analogue of the |∇>⨯|A> we got for 3d.

Problems:

1. Given a symmetric 4-T Aⁱᵏ, find how it transforms under a Lorentz transf along x. (A boost in x.)

Recall that the components of a 4-T transform like the products of components of two 4-v. This means that, as far as the transformation is concerned, we can treat

Aⁱᵏ as BⁱCᵏ.

For a 4-v Aⁱ, since it is an x-boost, A'²=A² and A'³=A³. But let's recall that, if we use the shorthand b=u/c and remembering the meaning of √,

B⁰ = (B'⁰ + bB'¹)/√   and   B¹ = (B'¹ + bB'⁰)/√ .

And similarly for C. Then, the component A⁰⁰ transforms like B⁰C⁰:

A⁰⁰ = B⁰C⁰ = (1/√)²·(B'⁰+bB'¹)(C'⁰+bC'¹) =

= (1/√)²·(B'⁰C'⁰ + bB'¹C'⁰ + bB'⁰C'¹ + b²B'¹C'¹) = 

= (1/√)²·(A'⁰⁰ + 2bA'⁰¹ + b²A'¹¹),

where we have used the fact that the tensor is symmetric, as in B'¹C'⁰ = B'⁰C'¹. We also calculate A⁰¹, and the rest are too trivial, since A¹¹ is like A⁰⁰ and the others involve indices 2 and 3 which are not transformed:

A⁰¹ = B⁰C¹ = (1/√)²·(B'⁰+bB'¹)(C'¹+bC'⁰) =

= (1/√)²·(B'⁰C'¹[1+b²] + bB'¹C'¹ + bB'⁰C'⁰) = 

= (1/√)²·(A'⁰¹[1+b²] + bA'⁰⁰ + bA'¹¹).

2. The same but now Aⁱᵏ is antisymmetric.

Again, all components involving indices 2 and 3 are trivial. Let's go for A⁰⁰:

A⁰⁰ = B⁰C⁰ = (1/√)²·(B'⁰+bB'¹)(C'⁰+bC'¹) =

= (1/√)²·(B'⁰C'⁰+bB'⁰C'¹+bB'¹C'⁰+b²B'¹C'¹) =

= (1/√)²·(A'⁰⁰+bA'⁰¹+bA'¹⁰+b²A'¹¹) = 0.

Here we have used that A'⁰¹ = -A'¹⁰ and that all diagonal terms are 0.

A⁰¹ = (1/√)²·(B'⁰+bB'¹)(C'¹+bC'⁰) = (1/√)²·(A'⁰¹+bA'⁰⁰+bA'¹¹+b²A'¹⁰) = 

= (1/√)²·A'⁰¹·(1-b²) = A'⁰¹.
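Both problems can be double-checked by brute force: write the boost as a matrix Λ and transform with Aⁱᵏ = ΛⁱₘΛᵏₙA'ᵐⁿ. A NumPy sketch (my construction, with random test tensors):

    import numpy as np

    b = 0.6
    root = np.sqrt(1 - b**2)                 # √, so (1/√)² = 1/(1 - b²)
    L = np.array([[1/root, b/root, 0, 0],    # x-boost from K' to K
                  [b/root, 1/root, 0, 0],
                  [0,      0,      1, 0],
                  [0,      0,      0, 1]])

    rng = np.random.default_rng(2)
    M = rng.random((4, 4))
    A_sym, A_anti = M + M.T, M - M.T         # test tensors A'ⁱᵏ in K'

    def boost(Ap):
        return L @ Ap @ L.T                  # Aⁱᵏ = ΛⁱₘΛᵏₙA'ᵐⁿ

    S = boost(A_sym)                         # problem 1:
    print(S[0, 0])
    print((A_sym[0, 0] + 2*b*A_sym[0, 1] + b**2*A_sym[1, 1]) / (1 - b**2))

    T = boost(A_anti)                        # problem 2:
    print(T[0, 0])                           # -> 0
    print(T[0, 1], A_anti[0, 1])             # equal: A⁰¹ = A'⁰¹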