The Zassenhaus formula and the algebra of the Pauli matrices


Edgardo S. Cheb-Terrab (1) and Bryan C. Sanctuary (2)

(1) Maplesoft

(2) Department of Chemistry, McGill University, Montreal, Quebec, Canada



The implementation of the Pauli matrices and their algebra was reviewed during 2018, including the algebraic manipulation of nested commutators, resulting in faster computations using simpler and more flexible input. As frequently happens, improvements of this type suddenly turn research problems that the literature presents as intractable in practice into tractable ones.


As an illustration, we tackle below the derivation of the coefficients entering the Zassenhaus formula shown in section 4 of [1] for the Pauli matrices up to order 10 (results in the literature go up to order 5). The computation presented can be reused to compute these coefficients up to any desired higher order (hardware limitations may apply). A number of examples that exploit this formula and its dual, the Baker-Campbell-Hausdorff formula, occur in connection with the Weyl prescription for converting a classical function to a quantum operator (see sec. 5 of [1]), as well as when solving the eigenvalue problem for classes of mathematical-physics partial differential equations [2].
To reproduce the results below - a worksheet with these contents is linked at the end - you need to have your Maple 2018.2.1 updated with the Maplesoft Physics Updates version 280 or higher.



[1] R.M. Wilcox, "Exponential Operators and Parameter Differentiation in Quantum Physics", Journal of Mathematical Physics, V.8, 4, (1967).


[2] S. Steinberg, "Applications of the Lie algebraic formulas of Baker, Campbell, Hausdorff, and Zassenhaus to the calculation of explicit solutions of partial differential equations", Journal of Differential Equations, V.26, 3, (1977).


[3] K. Huang, "Statistical Mechanics", John Wiley & Sons, Inc. 1963, p217, Eq.(10.60).


Formulation of the problem

The Zassenhaus formula expresses exp(lambda*(A+B)) as an infinite product of exponential operators involving nested commutators of increasing complexity

$$e^{\lambda (A+B)} = e^{\lambda A}\, e^{\lambda B}\, e^{\lambda^2 C_2}\, e^{\lambda^3 C_3} \cdots = e^{\lambda A}\, e^{\lambda B}\, e^{-\frac{\lambda^2}{2} [A,B]}\, e^{\frac{\lambda^3}{6} \left( 2 [B,[A,B]] + [A,[A,B]] \right)} \cdots$$

Given A, B and their commutator E = %Commutator(A, B), if A and B commute with E, then C[n] = 0 for n >= 3 and the Zassenhaus formula reduces to the product of the first three exponentials above. The interest here is in the general case, when %Commutator(A, E) <> 0 and %Commutator(B, E) <> 0, and the goal is to compute the Zassenhaus coefficients C[n] in terms of A, B for arbitrary finite n. Following [1], in that general case, differentiating the Zassenhaus formula with respect to lambda and multiplying from the right by exp(-lambda*(A+B)) one obtains

$$A + B = A + e^{\lambda A} B\, e^{-\lambda A} + e^{\lambda A} e^{\lambda B}\, 2 \lambda C_2\, e^{-\lambda B} e^{-\lambda A} + \cdots$$

This is an intricate formula, which however (see eq.(4.20) of [1]) can be represented in abstract form as


$$0 = \sum_{n=1}^{\infty} \frac{\lambda^n}{n!} \{A^n, B\} + 2 \lambda \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \frac{\lambda^{n+m}}{n!\, m!} \{A^m, B^n, C_2\} + 3 \lambda^2 \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \sum_{k=0}^{\infty} \frac{\lambda^{n+m+k}}{n!\, m!\, k!} \{A^k, B^m, C_2^n, C_3\} + \cdots$$

from where an equation to be solved for each C[n] is obtained by equating to 0 the coefficient of lambda^(n-1). In this formula, the repeated commutator bracket is defined inductively in terms of the standard commutator %Commutator(A, B) by

$$\{A^0, B\} = B, \qquad \{A^{n+1}, B\} = [A, \{A^n, B\}]$$

$$\{A^0, B^n, C_j\} = \{B^n, C_j\}, \qquad \{A^m, B^n, C_j\} = [A, \{A^{m-1}, B^n, C_j\}]$$

and higher-order repeated-commutator brackets are similarly defined. For example, taking the coefficient of lambda and lambda^2 and respectively solving each of them for C[2] and C[3] one obtains

C[2] = -(1/2)*%Commutator(A, B)

C[3] = (1/3)*%Commutator(B, %Commutator(A, B))+(1/6)*%Commutator(A, %Commutator(A, B))

This method is used in [3] to treat quantum deviations from the classical limit of the partition function for both a Bose-Einstein and Fermi-Dirac gas. The complexity of the computation of C[n] grows rapidly and in the literature only the coefficients up to C[5] have been published. Taking advantage of developments in the Physics package during 2018, below we show the computation up to C[10] and provide a compact approach to compute them up to arbitrary finite order.


Computing up to C[10]

Set the signature of spacetime such that its space part is equal to +++ and use lowercase Latin letters to represent space indices. Also set A, B and C[n] to represent quantum operators


Setup(op = {A, B, C}, signature = `+++-`, spaceindices = lowercaselatin)

[quantumoperators = {A, B, C}, signature = `+ + + -`, spaceindices = lowercaselatin]


To illustrate the computation up to C[10], a convenient example, where the commutator algebra is closed, consists of taking A and B as Pauli matrices which, multiplied by the imaginary unit, form a basis for the Lie algebra su(2), which in turn exponentiates to the Special Unitary Group SU(2). The algebra for the Pauli matrices involves a commutator and an anticommutator


%Commutator(Physics:-Psigma[i], Physics:-Psigma[j]) = (2*I)*Physics:-LeviCivita[i, j, k]*Physics:-Psigma[k], %AntiCommutator(Physics:-Psigma[i], Physics:-Psigma[j]) = 2*Physics:-KroneckerDelta[i, j]
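
As an independent cross-check outside Maple, the two algebra rules above can be verified numerically. The following is a minimal Python sketch, assuming the standard 2x2 matrix representation of the Pauli matrices; the helper names (mul, add, scale, levi_civita, check_pauli_algebra) are illustrative and not part of the Physics package.

```python
# Pauli matrices in the standard 2x2 representation (an assumption of this sketch).
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))
SIGMA = {1: S1, 2: S2, 3: S3}
I2 = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def mul(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(X, Y, sign=1):
    """X + sign*Y, entry by entry."""
    return tuple(tuple(X[i][j] + sign * Y[i][j] for j in range(2))
                 for i in range(2))

def scale(c, X):
    return tuple(tuple(c * X[i][j] for j in range(2)) for i in range(2))

def commutator(X, Y):
    return add(mul(X, Y), mul(Y, X), -1)

def anticommutator(X, Y):
    return add(mul(X, Y), mul(Y, X))

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

def check_pauli_algebra():
    """True if both rules hold for every pair of space indices (i, j)."""
    for i in (1, 2, 3):
        for j in (1, 2, 3):
            rhs = ZERO
            for k in (1, 2, 3):  # implicit sum over the repeated index k
                rhs = add(rhs, scale(2j * levi_civita(i, j, k), SIGMA[k]))
            if commutator(SIGMA[i], SIGMA[j]) != rhs:
                return False
            delta = 1 if i == j else 0
            if anticommutator(SIGMA[i], SIGMA[j]) != scale(2 * delta, I2):
                return False
    return True

print(check_pauli_algebra())
```

The right-hand side of the anticommutator rule is understood as multiplying the 2x2 identity matrix, which is why the sketch compares against scale(2 * delta, I2).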


Assign now A and B to two Pauli matrices, for instance

A := Psigma[1]



B := Psigma[3]



Next, to extract the coefficient of lambda^n from

$$0 = \sum_{n=1}^{\infty} \frac{\lambda^n}{n!} \{A^n, B\} + 2 \lambda \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \frac{\lambda^{n+m}}{n!\, m!} \{A^m, B^n, C_2\} + 3 \lambda^2 \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \sum_{k=0}^{\infty} \frac{\lambda^{n+m+k}}{n!\, m!\, k!} \{A^k, B^m, C_2^n, C_3\} + \cdots$$

and solve it for C[n+1], we note that each term has a factor lambda^m (m = 0, 1, 2, ...) multiplying a sum, so we only need to take into account the first n+1 terms (sums) and in each sum replace infinity by the corresponding n-m. For example, given C[2] = -(1/2)*%Commutator(A, B), to compute C[3] we only need to compute these first three terms:

$$0 = \sum_{n=1}^{2} \frac{\lambda^n}{n!} \{A^n, B\} + 2 \lambda \sum_{m=0}^{1} \sum_{n=0}^{1} \frac{\lambda^{n+m}}{n!\, m!} \{A^m, B^n, C_2\} + 3 \lambda^2 \sum_{k=0}^{0} \sum_{m=0}^{0} \sum_{n=0}^{0} \frac{\lambda^{n+m+k}}{n!\, m!\, k!} \{A^k, B^m, C_2^n, C_3\}$$

then solving for C[3] one gets C[3] = (1/3)*%Commutator(B, %Commutator(A, B))+(1/6)*%Commutator(A, %Commutator(A, B)).
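
For reference, the intermediate step can be written out explicitly. The following derivation only collects the coefficient of lambda^2 from the three truncated sums and substitutes C[2] = -(1/2)[A, B]; square brackets denote the standard commutator.

```latex
% Coefficient of \lambda^2, set to zero:
\frac{1}{2}\,[A,[A,B]] \;+\; 2\,\bigl([A,C_2] + [B,C_2]\bigr) \;+\; 3\,C_3 \;=\; 0
% Substituting C_2 = -\tfrac{1}{2}[A,B]:
\frac{1}{2}\,[A,[A,B]] \;-\; [A,[A,B]] \;-\; [B,[A,B]] \;+\; 3\,C_3 \;=\; 0
% Solving for C_3:
C_3 \;=\; \frac{1}{3}\,[B,[A,B]] \;+\; \frac{1}{6}\,[A,[A,B]]
```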

Also, since computing C[n] requires only the coefficient of lambda^(n-1), it is not necessary to compute all the terms of each multiple sum. One way of restricting the multiple sums to a single power of lambda consists of using multi-index summation, available in the Physics package (see Physics:-Library:-Add). For that purpose, redefine sum to extend its functionality with multi-index summation

Setup(redefinesum = true)

[redefinesum = true]


Now we can represent the same computation of C[3] without multiple sums and without computing unnecessary terms as

$$0 = \sum_{n=2} \frac{\lambda^n}{n!} \{A^n, B\} + 2 \lambda \sum_{n+m=1} \frac{\lambda^{n+m}}{n!\, m!} \{A^m, B^n, C_2\} + 3 \lambda^2 \sum_{n+m+k=0} \frac{\lambda^{n+m+k}}{n!\, m!\, k!} \{A^k, B^m, C_2^n, C_3\}$$
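
To make the meaning of a multi-index summation limit such as n+m = 1 concrete, here is a minimal Python sketch that enumerates all tuples of non-negative integers with a prescribed sum. The function name multi_indices is hypothetical; it only illustrates the idea and says nothing about how the Physics package implements it.

```python
def multi_indices(count, total):
    """Yield every tuple of `count` non-negative integers that add up to `total`."""
    if total < 0:
        return
    if count == 1:
        yield (total,)
        return
    for first in range(total + 1):
        # Fix the first index, then distribute what is left among the others.
        for rest in multi_indices(count - 1, total - first):
            yield (first,) + rest

# The three summation limits used above to compute C[3]:
print(list(multi_indices(1, 2)))   # n = 2
print(list(multi_indices(2, 1)))   # n + m = 1
print(list(multi_indices(3, 0)))   # n + m + k = 0
```

Each tuple corresponds to exactly one nested-commutator term, so only the terms that contribute to the chosen power of lambda are ever generated.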

Finally, we need a computational representation for the repeated commutator bracket

$$\{A^0, B\} = B, \qquad \{A^{n+1}, B\} = [A, \{A^n, B\}]$$

One way of representing this commutator bracket operation is defining a procedure, say F, with a cache to avoid recomputing lower order nested commutators, as follows

F := proc (A, B, n) options operator, arrow; if n::negint then 0 elif n = 0 then B elif n::posint then %Commutator(A, F(A, B, n-1)) else 'F(A, B, n)' end if end proc

proc (A, B, n) options operator, arrow; if n::negint then 0 elif n = 0 then B elif n::posint then %Commutator(A, F(A, B, n-1)) else 'F(A, B, n)' end if end proc


Cache(procedure = F)


For example,

F(A, B, 1)

%Commutator(Physics:-Psigma[1], Physics:-Psigma[3])


F(A, B, 2)

%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], Physics:-Psigma[3]))


F(A, B, 3)

%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], Physics:-Psigma[3])))
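
The same recursion-with-cache idea can be sketched outside Maple. The following Python analogue is an illustration only: it works on explicit 2x2 matrices (standard representation of Psigma[1] and Psigma[3], an assumption of this sketch) instead of inert commutators, and uses functools.lru_cache in the role of the cache.

```python
from functools import lru_cache

# Explicit matrices for Psigma[1] and Psigma[3]; nested tuples are hashable,
# which is what allows lru_cache to memoize on them.
S1 = ((0, 1), (1, 0))
S3 = ((1, 0), (0, -1))
ZERO = ((0, 0), (0, 0))

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def commutator(X, Y):
    XY, YX = mul(X, Y), mul(Y, X)
    return tuple(tuple(XY[i][j] - YX[i][j] for j in range(2))
                 for i in range(2))

@lru_cache(maxsize=None)
def nested(A, B, n):
    """Repeated commutator [A, [A, ... [A, B]]] with n copies of A."""
    if n < 0:
        return ZERO          # mirrors the branch of F for negative n
    if n == 0:
        return B
    # The lower-order result is looked up in the cache when already computed.
    return commutator(A, nested(A, B, n - 1))

for n in (1, 2, 3):
    print(n, nested(S1, S3, n))
```

Calling nested(S1, S3, 3) after nested(S1, S3, 2) costs a single additional commutator, since the lower-order result is retrieved from the cache.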


We can set now the value of C[2]

C[2] := -(1/2)*Commutator(A, B)



and enter the formula that involves only multi-index summation

H := sum(lambda^n*F(A, B, n)/factorial(n), n = 2)+2*lambda*(sum(lambda^(n+m)*F(A, F(B, C[2], n), m)/(factorial(n)*factorial(m)), n+m = 1))+3*lambda^2*(sum(lambda^(n+m+k)*F(A, F(B, F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = 0))

(1/2)*lambda^2*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], Physics:-Psigma[3]))+2*lambda*(lambda*%Commutator(Physics:-Psigma[1], I*Physics:-Psigma[2])+lambda*%Commutator(Physics:-Psigma[3], I*Physics:-Psigma[2]))+3*lambda^2*C[3]


from where we compute C[3] by solving the coefficient of lambda^2 for it. Since, due to the multi-index summation, this expression already contains lambda^2 as an overall factor, it suffices to solve H for C[3]:

C[3] = Simplify(solve(H, C[3]))

C[3] = (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]
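
This value can be cross-checked numerically. The Python sketch below, which assumes the standard 2x2 representation of Psigma[1] and Psigma[3] and uses illustrative helper names, evaluates the nested-commutator form C[3] = (1/3)[B,[A,B]] + (1/6)[A,[A,B]] given earlier and compares it with the result above. It also measures how well the product of the first four exponentials approximates exp(lambda*(A+B)); if C[2] and C[3] are right, the error should scale like lambda^4.

```python
S1 = ((0, 1), (1, 0))      # Psigma[1]
S3 = ((1, 0), (0, -1))     # Psigma[3]
I2 = ((1, 0), (0, 1))

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def scale(c, X):
    return tuple(tuple(c * X[i][j] for j in range(2)) for i in range(2))

def commutator(X, Y):
    return add(mul(X, Y), scale(-1, mul(Y, X)))

def expm(X, terms=30):
    """Matrix exponential from its power series (fine for the small norms used here)."""
    result, term = I2, I2
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, X))
        result = add(result, term)
    return result

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A, B = S1, S3
AB = commutator(A, B)
C2 = scale(-1 / 2, AB)
C3 = add(scale(1 / 3, commutator(B, AB)), scale(1 / 6, commutator(A, AB)))
C3_from_text = add(scale(2 / 3, S3), scale(-4 / 3, S1))

def truncation_error(lam):
    """Largest entry of exp(lam*(A+B)) minus the product of the first four exponentials."""
    exact = expm(scale(lam, add(A, B)))
    product = I2
    for factor in (scale(lam, A), scale(lam, B),
                   scale(lam**2, C2), scale(lam**3, C3)):
        product = mul(product, expm(factor))
    return max_abs_diff(exact, product)

print(max_abs_diff(C3, C3_from_text))
print(truncation_error(0.02) / truncation_error(0.01))   # close to 2**4 if the error is O(lambda^4)
```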


In order to generalize the formula for H for higher powers of lambda, the right-hand side of the multi-index summation limit can be expressed in terms of an abstract N, and H transformed into a mapping:


H := unapply(sum(lambda^n*F(A, B, n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(A, F(B, C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))+3*lambda^2*(sum(lambda^(n+m+k)*F(A, F(B, F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = N-2)), N)

proc (N) options operator, arrow; lambda^N*F(Physics:-Psigma[1], Physics:-Psigma[3], N)/factorial(N)+2*lambda*(sum(Physics:-`*`(Physics:-`^`(lambda, n+m), Physics:-`^`(Physics:-`*`(factorial(n), factorial(m)), -1), F(Physics:-Psigma[1], F(Physics:-Psigma[3], I*Physics:-Psigma[2], n), m)), n+m = N-1))+3*lambda^2*(sum(Physics:-`*`(Physics:-`^`(lambda, n+m+k), Physics:-`^`(Physics:-`*`(factorial(n), factorial(m), factorial(k)), -1), F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(I*Physics:-Psigma[2], C[3], n), m), k)), n+m+k = N-2)) end proc


Now we have

H(1)


lambda*%Commutator(Physics:-Psigma[1], Physics:-Psigma[3])+(2*I)*lambda*Physics:-Psigma[2]


The following is already equal to the expression obtained above for H with the summation limits entered by hand

H(2)

(1/2)*lambda^2*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], Physics:-Psigma[3]))+2*lambda*(lambda*%Commutator(Physics:-Psigma[1], I*Physics:-Psigma[2])+lambda*%Commutator(Physics:-Psigma[3], I*Physics:-Psigma[2]))+3*lambda^2*C[3]


In this way, we can reproduce the results published in the literature for the coefficients of the Zassenhaus formula up to C[5] by adding two more multi-index sums to the definition of H above. Unassign C first


H := unapply(sum(lambda^n*F(A, B, n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(A, F(B, C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))+3*lambda^2*(sum(lambda^(n+m+k)*F(A, F(B, F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = N-2))+4*lambda^3*(sum(lambda^(n+m+k+l)*F(A, F(B, F(C[2], F(C[3], C[4], n), m), k), l)/(factorial(n)*factorial(m)*factorial(k)*factorial(l)), n+m+k+l = N-3))+5*lambda^4*(sum(lambda^(n+m+k+l+p)*F(A, F(B, F(C[2], F(C[3], F(C[4], C[5], n), m), k), l), p)/(factorial(n)*factorial(m)*factorial(k)*factorial(l)*factorial(p)), n+m+k+l+p = N-4)), N)

We compute now up to C[5] in one go

for j to 4 do C[j+1] := Simplify(solve(H(j), C[j+1])) end do









The nested-commutator expression solved in the last step for C[5] is


(1/24)*lambda^4*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], Physics:-Psigma[3]))))+2*lambda*((1/6)*lambda^3*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], I*Physics:-Psigma[2])))+(1/2)*lambda^3*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[3], I*Physics:-Psigma[2])))+(1/2)*lambda^3*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[3], %Commutator(Physics:-Psigma[3], I*Physics:-Psigma[2])))+(1/6)*lambda^3*%Commutator(Physics:-Psigma[3], %Commutator(Physics:-Psigma[3], %Commutator(Physics:-Psigma[3], I*Physics:-Psigma[2]))))+3*lambda^2*((1/2)*lambda^2*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[1], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]))+lambda^2*%Commutator(Physics:-Psigma[1], %Commutator(Physics:-Psigma[3], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]))+(1/2)*lambda^2*%Commutator(Physics:-Psigma[3], %Commutator(Physics:-Psigma[3], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]))+lambda^2*%Commutator(Physics:-Psigma[1], %Commutator(I*Physics:-Psigma[2], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]))+lambda^2*%Commutator(Physics:-Psigma[3], %Commutator(I*Physics:-Psigma[2], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1]))+(1/2)*lambda^2*%Commutator(I*Physics:-Psigma[2], %Commutator(I*Physics:-Psigma[2], (2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1])))+4*lambda^3*(lambda*%Commutator(Physics:-Psigma[1], -((1/3)*I)*((3*I)*Physics:-Psigma[1]+(6*I)*Physics:-Psigma[3]-4*Physics:-Psigma[2]))+lambda*%Commutator(Physics:-Psigma[3], -((1/3)*I)*((3*I)*Physics:-Psigma[1]+(6*I)*Physics:-Psigma[3]-4*Physics:-Psigma[2]))+lambda*%Commutator(I*Physics:-Psigma[2], -((1/3)*I)*((3*I)*Physics:-Psigma[1]+(6*I)*Physics:-Psigma[3]-4*Physics:-Psigma[2]))+lambda*%Commutator((2/3)*Physics:-Psigma[3]-(4/3)*Physics:-Psigma[1], 
-((1/3)*I)*((3*I)*Physics:-Psigma[1]+(6*I)*Physics:-Psigma[3]-4*Physics:-Psigma[2])))+5*lambda^4*(-(8/9)*Physics:-Psigma[1]-(158/45)*Physics:-Psigma[3]-((16/3)*I)*Physics:-Psigma[2])
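
The values of C[3], C[4] and C[5] that appear substituted inside the expression above can be cross-checked independently. The following Python sketch, assuming the standard 2x2 representation of the Pauli matrices and using hypothetical helper names, implements the same recursion numerically: the coefficient of lambda^N, restricted by multi-index summation, is solved for C[N+1].

```python
from math import factorial

S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))
ZERO = ((0, 0), (0, 0))

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def scale(c, X):
    return tuple(tuple(c * X[i][j] for j in range(2)) for i in range(2))

def commutator(X, Y):
    return add(mul(X, Y), scale(-1, mul(Y, X)))

def multi_indices(count, total):
    """All tuples of `count` non-negative integers adding up to `total`."""
    if count == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in multi_indices(count - 1, total - first):
            yield (first,) + rest

def zassenhaus_coefficients(A, B, order):
    """Return {j: C[j]} for j = 2..order as explicit matrices."""
    ops = [A, B]                      # becomes [A, B, C[2], C[3], ...]
    C = {}
    for N in range(1, order):         # coefficient of lambda^N fixes C[N+1]
        total = ZERO
        for j in range(1, N + 1):     # j-th sum, prefactor j*lambda^(j-1)
            for idx in multi_indices(j, N - (j - 1)):
                term, weight = ops[j], 1.0
                # Innermost commutators use the operator closest to the target.
                for op, power in zip(reversed(ops[:j]), idx):
                    for _ in range(power):
                        term = commutator(op, term)
                    weight /= factorial(power)
                total = add(total, scale(j * weight, term))
        C[N + 1] = scale(-1.0 / (N + 1), total)
        ops.append(C[N + 1])
    return C

def pauli_combination(c1, c2, c3):
    return add(add(scale(c1, S1), scale(c2, S2)), scale(c3, S3))

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

C = zassenhaus_coefficients(S1, S3, 5)
# Values read off from the Maple expression above.
expected = {
    3: pauli_combination(-4 / 3, 0, 2 / 3),
    4: pauli_combination(1, 4j / 3, 2),
    5: pauli_combination(-8 / 9, -16j / 3, -158 / 45),
}
for j in (3, 4, 5):
    print(j, max_abs_diff(C[j], expected[j]))
```

Working with floating-point matrices instead of symbolic nested commutators is a deliberate simplification: it cannot replace the symbolic result, but it offers a quick sanity check of the numbers.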


With everything understood, we now want to generalize these results into an approach that computes a coefficient C[n] of arbitrarily high order, and then use that generalization to compute all the Zassenhaus coefficients up to C[10]. Typing the formula for H by hand for higher powers of lambda is, however, prone to typographical mistakes. The following is a program, written in the Maple programming language, that produces these formulas for an arbitrary integer power of lambda:

Formula := proc(A, B, C, Q)


This Formula program uses a sequence of summation indices with as many indices as the order of the coefficient C[n] we want to compute; in this case we need 10 of them

summation_indices := n, m, k, l, p, q, r, s, t, u

n, m, k, l, p, q, r, s, t, u


To avoid interference from the results computed in the loop above, unassign C again



Now the formulas typed by hand in the lines above to compute C[2], C[3] and C[5] are constructed by the computer

Formula(A, B, C, 2)

sum(lambda^n*F(Physics:-Psigma[1], Physics:-Psigma[3], n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))


Formula(A, B, C, 3)

sum(lambda^n*F(Physics:-Psigma[1], Physics:-Psigma[3], n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))+3*lambda^2*(sum(lambda^(n+m+k)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = N-2))


Formula(A, B, C, 5)

sum(lambda^n*F(Physics:-Psigma[1], Physics:-Psigma[3], n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))+3*lambda^2*(sum(lambda^(n+m+k)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = N-2))+4*lambda^3*(sum(lambda^(n+m+k+l)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], C[4], n), m), k), l)/(factorial(n)*factorial(m)*factorial(k)*factorial(l)), n+m+k+l = N-3))+5*lambda^4*(sum(lambda^(n+m+k+l+p)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], C[5], n), m), k), l), p)/(factorial(n)*factorial(l)*factorial(m)*factorial(k)*factorial(p)), n+m+k+l+p = N-4))
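
The regular structure of these formulas can be made explicit with a small text generator. The Python sketch below is hypothetical and is not the Formula program used above; it only builds the text of the formula for a given order, with the symbols A and B left unevaluated, to show how each additional sum adds one summation index and one nesting level of F.

```python
# Same index names as in the sequence summation_indices of the text.
INDICES = ["n", "m", "k", "l", "p", "q", "r", "s", "t", "u"]

def nested_F(j):
    """Text of the nested F call appearing in the j-th sum."""
    operators = ["A", "B"] + [f"C[{i}]" for i in range(2, j)]
    text = "B" if j == 1 else f"C[{j}]"
    # The innermost F uses the last operator together with the first index.
    for op, idx in zip(reversed(operators[:j]), INDICES):
        text = f"F({op}, {text}, {idx})"
    return text

def formula_text(order):
    """Text of the formula containing the coefficients C[2], ..., C[order]."""
    if not 2 <= order <= len(INDICES):
        raise ValueError("order must be between 2 and the number of indices")
    pieces = []
    for j in range(1, order + 1):
        idx = INDICES[:j]
        power = "+".join(idx)
        exponent = power if j == 1 else f"({power})"
        denom = "*".join(f"factorial({i})" for i in idx)
        denom = denom if j == 1 else f"({denom})"
        limit = "N" if j == 1 else f"N-{j - 1}"
        body = (f"sum(lambda^{exponent}*{nested_F(j)}/{denom}, "
                f"{power} = {limit})")
        if j == 1:
            pieces.append(body)
        elif j == 2:
            pieces.append(f"2*lambda*({body})")
        else:
            pieces.append(f"{j}*lambda^{j - 1}*({body})")
    return "+".join(pieces)

print(formula_text(2))
print(formula_text(3))
```

Supporting orders above 10 in this sketch only requires extending INDICES, which parallels the role of the sequence of summation indices in the Maple computation.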



Then construct the formula for C[10] and turn it into a mapping with respect to N, as done above for C[5]

H := unapply(Formula(A, B, C, 10), N)

proc (N) options operator, arrow; sum(lambda^n*F(Physics:-Psigma[1], Physics:-Psigma[3], n)/factorial(n), n = N)+2*lambda*(sum(lambda^(n+m)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], C[2], n), m)/(factorial(n)*factorial(m)), n+m = N-1))+3*lambda^2*(sum(lambda^(n+m+k)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], C[3], n), m), k)/(factorial(n)*factorial(m)*factorial(k)), n+m+k = N-2))+4*lambda^3*(sum(lambda^(n+m+k+l)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], C[4], n), m), k), l)/(factorial(n)*factorial(m)*factorial(k)*factorial(l)), n+m+k+l = N-3))+5*lambda^4*(sum(lambda^(n+m+k+l+p)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], C[5], n), m), k), l), p)/(factorial(n)*factorial(l)*factorial(m)*factorial(k)*factorial(p)), n+m+k+l+p = N-4))+6*lambda^5*(sum(lambda^(n+m+k+l+p+q)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], F(C[5], C[6], n), m), k), l), p), q)/(factorial(n)*factorial(l)*factorial(m)*factorial(p)*factorial(k)*factorial(q)), n+m+k+l+p+q = N-5))+7*lambda^6*(sum(lambda^(n+m+k+l+p+q+r)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], F(C[5], F(C[6], C[7], n), m), k), l), p), q), r)/(factorial(n)*factorial(l)*factorial(m)*factorial(p)*factorial(q)*factorial(k)*factorial(r)), n+m+k+l+p+q+r = N-6))+8*lambda^7*(sum(lambda^(n+m+k+l+p+q+r+s)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], F(C[5], F(C[6], F(C[7], C[8], n), m), k), l), p), q), r), s)/(factorial(n)*factorial(r)*factorial(l)*factorial(m)*factorial(p)*factorial(q)*factorial(k)*factorial(s)), n+m+k+l+p+q+r+s = N-7))+9*lambda^8*(sum(lambda^(n+m+k+l+p+q+r+s+t)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], F(C[5], F(C[6], F(C[7], F(C[8], C[9], n), m), k), l), p), q), r), s), t)/(factorial(s)*factorial(n)*factorial(r)*factorial(l)*factorial(m)*factorial(p)*factorial(q)*factorial(k)*factorial(t)), n+m+k+l+p+q+r+s+t = 
N-8))+10*lambda^9*(sum(lambda^(n+m+k+l+p+q+r+s+t+u)*F(Physics:-Psigma[1], F(Physics:-Psigma[3], F(C[2], F(C[3], F(C[4], F(C[5], F(C[6], F(C[7], F(C[8], F(C[9], C[10], n), m), k), l), p), q), r), s), t), u)/(factorial(s)*factorial(n)*factorial(t)*factorial(r)*factorial(l)*factorial(m)*factorial(p)*factorial(q)*factorial(k)*factorial(u)), n+m+k+l+p+q+r+s+t+u = N-9)) end proc


Compute now the coefficients of the Zassenhaus formula up to C[10] all in one go

for j to 9 do C[j+1] := Simplify(solve(H(j), C[j+1])) end do



















Notes: with the material above you can compute higher-order values of C[n]. For that you need to:


Unassign C, as done twice above, to avoid interference from the results just computed.


Add more summation indices to the sequence summation_indices defined above, as many as the maximum value of n in C[n].


Keep in mind that the growth in size and complexity is significant, with each C[n] taking significantly more time than the computation of all the previous ones.


Re-execute the input line that defines H through Formula and the loop that computes the coefficients, adjusting the order in both.
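
One way to quantify part of this growth is to count the nested-commutator terms in the equation that is solved for each coefficient. The Python sketch below uses illustrative function names and counts the multi-indices contributing to the coefficient of lambda^N, which determines C[N+1]. The count ignores the nesting depth of each term and the cost of simplification, so it is only a lower bound on the actual effort.

```python
from math import comb

def terms_in_sum(j, N):
    """Number of tuples of j non-negative integers adding up to N - (j - 1)."""
    return comb(N, j - 1)

def terms_in_equation(N):
    """Total number of terms in the lambda^N coefficient, including the one with C[N+1]."""
    return sum(terms_in_sum(j, N) for j in range(1, N + 2))

for order in range(2, 11):
    print(order, terms_in_equation(order - 1))
```

The total doubles with each order, since the binomial coefficients sum to 2^N; in addition, every term contains up to N nested commutators of coefficients that are themselves growing in size.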



Edgardo S. Cheb-Terrab
Physics, Differential Equations and Mathematical Functions, Maplesoft

