Krylov quantum diagonalization
In this lesson on Krylov quantum diagonalization (KQD) we will answer the following:
- What is the Krylov method, generally?
- Why does the Krylov method work and under what conditions?
- How does quantum computing play a role?
The quantum portion of the calculations is based largely on the work in Ref [1].
The video below gives an overview of Krylov methods in classical computing, motivates their use, and explains how quantum computing can play a role in that workstream. The subsequent text offers more detail and implements a Krylov method both classically, and using a quantum computer.
1. Introduction to Krylov methods
A Krylov subspace method can refer to any of several methods built around what is called the Krylov subspace. A complete review of these is beyond the scope of this lesson, but Refs [2-4] all give substantially more background. Here, we will focus on what a Krylov subspace is, how and why it is useful in solving eigenvalue problems, and finally how it can be implemented on a quantum computer.

Definition: Given a symmetric, positive semi-definite matrix $A$, the Krylov space of order $r$ is the space spanned by the vectors obtained by multiplying increasing powers of $A$, up to $A^{r-1}$, with a reference vector $v_0$:
$$\mathcal{K}_r(A, v_0) = \mathrm{span}\{v_0,\ A v_0,\ A^2 v_0,\ \ldots,\ A^{r-1} v_0\}.$$
Although the vectors above span what we call a Krylov subspace, there is no reason to think that they will be orthogonal. One often uses an iterative orthonormalization similar to Gram-Schmidt orthogonalization; here the process is slightly different, since each new vector is made orthogonal to the others as it is generated. In this context this is called Arnoldi iteration. Starting with the normalized initial vector $v_0$, one generates the next vector $w_1 = A v_0$, and then ensures that this second vector is orthogonal to the first by subtracting off its projection onto $v_0$. That is,
$$\tilde{v}_1 = w_1 - (v_0^T w_1)\, v_0, \qquad v_1 = \tilde{v}_1 / \|\tilde{v}_1\|.$$
It is now easy to see that $v_0^T v_1 = 0$, since
$$v_0^T \tilde{v}_1 = v_0^T w_1 - (v_0^T w_1)(v_0^T v_0) = 0.$$
We do the same for the next vector, ensuring it is orthogonal to both of the previous two:
$$\tilde{v}_2 = A v_1 - (v_0^T A v_1)\, v_0 - (v_1^T A v_1)\, v_1,$$
again followed by normalization. If we repeat this process for all vectors, we have a complete orthonormal basis for a Krylov space. Note that for an $N \times N$ matrix the orthogonalization process will yield the zero vector once $r > N$, since $N$ orthogonal vectors necessarily span the full space. The process will also yield zero if any vector is an eigenvector of $A$, since all subsequent vectors would be multiples of that vector.
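As a concrete illustration, the iteration just described can be sketched in a few lines of NumPy (the function and variable names here are our own, chosen for clarity):

```python
import numpy as np

def krylov_basis(A, v0, r):
    """Build an orthonormal basis for the order-r Krylov space of A and v0,
    orthogonalizing each new vector against all previous ones (Arnoldi iteration)."""
    basis = [v0 / np.linalg.norm(v0)]
    for _ in range(r - 1):
        w = A @ basis[-1]              # candidate vector A v_k
        for v in basis:                # subtract projections onto earlier vectors
            w = w - (v @ w) * v
        norm = np.linalg.norm(w)
        if norm < 1e-12:               # space exhausted, or v0 was an eigenvector
            break
        basis.append(w / norm)
    return np.array(basis)

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
V = krylov_basis(A, np.array([1., 0., 0.]), 3)
print(V @ V.T)  # rows are orthonormal: this is the 3x3 identity
```

The rows of `V` form an orthonormal basis for the Krylov space; projecting $A$ onto the subspace then amounts to computing $V A V^T$.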
1.1 A simple example: Krylov by hand
Let us step through the generation of a Krylov subspace for a trivially small matrix, so that we can see the process. We start with a matrix of interest:
$$A = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix}$$
For this small example, we can determine the eigenvectors and eigenvalues easily even by hand. We show the numerical solution here.
# One might use linalg.eigh here, but later matrices may not be Hermitian. So we use linalg.eig in this lesson.
import numpy as np
A = np.array([[4, -1, 0], [-1, 4, -1], [0, -1, 4]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print("The eigenvalues are ", eigenvalues)
print("The eigenvectors are ", eigenvectors)
The eigenvalues are [2.58578644 4. 5.41421356]
The eigenvectors are [[ 5.00000000e-01 -7.07106781e-01 5.00000000e-01]
[ 7.07106781e-01 1.37464400e-16 -7.07106781e-01]
[ 5.00000000e-01 7.07106781e-01 5.00000000e-01]]
We record them here for later comparison: $\lambda \approx \{2.586,\ 4.000,\ 5.414\}$.
We would like to study how this process works (or fails) as we increase the dimension $r$ of our Krylov subspace. To this end, we will apply this process:
- Generate a subspace of the full vector space starting with a randomly chosen vector (call it $v_0$ once it is normalized, as above).
- Project the full matrix $A$ onto that subspace, and find the eigenvalues of that projected matrix.
- Increase the size of the subspace by generating more vectors, ensuring that they are orthonormal, using a process similar to Gram-Schmidt orthogonalization.
- Project $A$ onto the larger subspace and find the eigenvalues of the resulting matrix.
- Repeat this until the eigenvalues converge (or in this toy case, until you have generated vectors spanning the full vector space of the original matrix $A$).
A production implementation of the Krylov method would not solve the eigenvalue problem for the projected matrix at every subspace dimension as the subspace is built. One could simply construct the subspace of the desired dimension, project the matrix onto it, and diagonalize the projected matrix once. Projecting and diagonalizing at each subspace dimension is done here only to check convergence.
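The steps above can be sketched as follows (a minimal NumPy implementation; the helper function name is our own):

```python
import numpy as np

def krylov_basis(A, v0, r):
    """Orthonormal basis for the order-r Krylov space, built by Gram-Schmidt
    as each new vector A v_k is generated."""
    basis = [v0 / np.linalg.norm(v0)]
    for _ in range(r - 1):
        w = A @ basis[-1]
        for v in basis:
            w = w - (v @ w) * v
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
rng = np.random.default_rng(1234)
v0 = rng.normal(size=3)  # random initial vector (almost surely not an eigenvector)

# Grow the subspace one dimension at a time and watch the eigenvalues converge.
for r in range(1, 4):
    V = krylov_basis(A, v0, r)   # rows are the orthonormal basis vectors
    A_r = V @ A @ V.T            # projection of A onto the Krylov subspace
    print(r, np.sort(np.linalg.eigvalsh(A_r)))
```

In practice only the final projection is needed; the intermediate diagonalizations in the loop serve only to monitor convergence.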
Dimension $r = 1$:
We choose a random vector, say
$$v_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$
If it is not already normalized, normalize it.
We now project our matrix onto the subspace of this one vector:
$$A_1 = v_0^T A v_0 = 4$$
This is our projection of the matrix onto our Krylov subspace when it contains just the single vector $v_0$. The eigenvalue of this $1 \times 1$ matrix is trivially 4. We can think of this as our zeroth-order estimate of the eigenvalues (in this case just one) of $A$. Although it is a poor estimate, it is the correct order of magnitude.
Dimension $r = 2$:
We now generate the next vector in our subspace by operating with $A$ on the previous vector:
$$w_1 = A v_0 = \begin{pmatrix} 4 \\ -1 \\ 0 \end{pmatrix}$$
Now we subtract off the projection of this vector onto our previous vector to ensure orthogonality:
$$\tilde{v}_1 = w_1 - (v_0^T w_1)\, v_0 = \begin{pmatrix} 4 \\ -1 \\ 0 \end{pmatrix} - 4 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}$$
If it is not already normalized, normalize it. In this case, the vector is already normalized, so
$$v_1 = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}$$
We now project our matrix $A$ onto the subspace of these two vectors:
$$A_2 = \begin{pmatrix} v_0^T A v_0 & v_0^T A v_1 \\ v_1^T A v_0 & v_1^T A v_1 \end{pmatrix} = \begin{pmatrix} 4 & 1 \\ 1 & 4 \end{pmatrix}$$
We are still left with the problem of determining the eigenvalues of this matrix. But this matrix is slightly smaller than the full matrix; in problems involving very large matrices, working in this smaller subspace may be highly advantageous.
The eigenvalues of $A_2$ are 3 and 5. Although this is still not a good estimate, it is better than the zeroth-order estimate. We will carry this out for one more iteration, to ensure the process is clear. In this toy case, however, that undercuts the point of the method: we will end up diagonalizing a full $3 \times 3$ matrix in the next iteration, meaning we have not saved time or computational power.
Dimension $r = 3$:
We now generate the next vector in our subspace by operating with $A$ on the previous vector:
$$w_2 = A v_1 = \begin{pmatrix} 1 \\ -4 \\ 1 \end{pmatrix}$$
Now we subtract off the projections of this vector onto both of our previous vectors to ensure orthogonality:
$$\tilde{v}_2 = w_2 - (v_0^T w_2)\, v_0 - (v_1^T w_2)\, v_1 = \begin{pmatrix} 1 \\ -4 \\ 1 \end{pmatrix} - 1 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} - 4 \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
If it is not already normalized, normalize it. In this case, the vector is already normalized, so
$$v_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
We now project our matrix $A$ onto the subspace of these vectors:
$$A_3 = \begin{pmatrix} 4 & 1 & 0 \\ 1 & 4 & 1 \\ 0 & 1 & 4 \end{pmatrix}$$
We now determine the eigenvalues:
$$\lambda = 4,\ 4 \pm \sqrt{2} \approx \{2.586,\ 4,\ 5.414\}$$
These eigenvalues are exactly the eigenvalues of the original matrix $A$. This must be the case, since we have expanded our Krylov subspace to span the entire vector space of $A$.
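We can verify the full-dimension projection numerically. Here we assume the initial vector $(1, 0, 0)^T$, which yields the orthonormal Krylov basis $(1,0,0)^T$, $(0,-1,0)^T$, $(0,0,1)^T$; any generic starting vector gives the same eigenvalues once the subspace spans the full space:

```python
import numpy as np

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
# Orthonormal Krylov basis vectors as rows (assuming initial vector (1, 0, 0)^T)
V = np.array([[1., 0., 0.], [0., -1., 0.], [0., 0., 1.]])
A3 = V @ A @ V.T  # projection onto the full-dimension Krylov subspace
print(A3)
print("projected:", np.sort(np.linalg.eigvals(A3).real))
print("original: ", np.sort(np.linalg.eigvals(A).real))
```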
In this small example, the Krylov method may not appear any easier than direct diagonalization. Indeed, as we will see in later sections, the Krylov method is only advantageous above a certain matrix dimension; it is intended to help us solve eigenvalue/eigenvector problems for extremely large matrices.

This is the only example we will show worked “by hand”, but section 2 below shows computational examples.
Clarification of terms
A common misconception is that there is just a single Krylov subspace for a given problem. But of course, since there are many initial vectors to which our matrix could be applied, there are many possible Krylov subspaces. We will only use the phrase "the Krylov subspace" to refer to a specific Krylov subspace already defined for a specific example. For general problem-solving approaches we will refer to "a Krylov subspace". A final clarification is that it is valid to refer to a "Krylov space". One often sees it called a "Krylov subspace" because of its use in the context of projecting matrices from an initial space into a subspace. In keeping with that context, we will mostly refer to it as a subspace here.
Check your understanding
Read the questions below, think about your answer, then click the triangle to reveal each solution.
Explain why it is (a) not useful and (b) not possible to extend the dimension of the Krylov subspace beyond the dimension of the matrix of interest.
Answer:
(a) Since we are orthonormalizing the vectors as we produce them, a set of $N$ such vectors (for an $N \times N$ matrix) forms a complete basis, meaning a linear combination of them can be used to create any vector in the space; further vectors add nothing. (b) The orthogonalization process consists of subtracting off the projection of a new vector onto all previous vectors. If the previous vectors span the full vector space, then subtracting off these projections will always leave the zero vector.
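A quick numerical check of (b): once we hold a complete orthonormal basis of a $3 \times 3$ matrix's space (here a simple axis-aligned one), orthogonalizing the next candidate vector against it leaves numerically nothing:

```python
import numpy as np

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
V = np.array([[1., 0., 0.], [0., -1., 0.], [0., 0., 1.]])  # complete orthonormal basis
w = A @ V[-1]            # candidate for a fourth Krylov vector
for v in V:
    w = w - (v @ w) * v  # subtract projections onto all basis vectors
print(np.linalg.norm(w))  # effectively zero: no new direction remains
```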
Suppose a fellow researcher is demonstrating the Krylov method applied to a small toy matrix for some colleagues. This fellow researcher plans to use the matrix
$$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$
and the initial vector
$$v_0 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$$
Is there something wrong with this? How would you advise your colleague?
Answer:
Your colleague has accidentally chosen an eigenvector as the initial vector. Acting with the matrix on it will simply return the same vector, scaled by the eigenvalue, so the process will not generate a subspace of increasing dimension. Advise your colleague to select a different initial vector, making sure it is not an eigenvector.
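This failure mode is easy to demonstrate numerically. Below we use the exercise matrix and one of its eigenvectors, $(1, 0, -1)^T/\sqrt{2}$ (eigenvalue 1), as the ill-chosen starting vector; any eigenvector behaves the same way:

```python
import numpy as np

A = np.array([[1., 1., 0.], [1., 1., 1.], [0., 1., 1.]])
v0 = np.array([1., 0., -1.]) / np.sqrt(2)  # an eigenvector of A (eigenvalue 1)
w = A @ v0                 # acting with A just rescales v0
w = w - (v0 @ w) * v0      # Gram-Schmidt step against v0
print(np.linalg.norm(w))   # effectively zero: no second basis vector appears
```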
Apply the Krylov method to the matrix below, selecting an appropriate new initial vector. Write down the estimates of the minimum eigenvalue at the 0th and 1st order of your Krylov subspace.
$$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$
Answer:
There are many possible answers depending on the choice of initial vector. We will choose:
$$v_0 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
To get $v_1$ we apply $A$ once to $v_0$, then make the result orthogonal to $v_0$ and normalize:
$$A v_0 = \frac{1}{\sqrt{3}} \begin{pmatrix} 2 \\ 3 \\ 2 \end{pmatrix}, \qquad v_1 = \begin{pmatrix} -1/\sqrt{6} \\ \sqrt{2/3} \\ -1/\sqrt{6} \end{pmatrix}$$
At 0th order, the projection onto our Krylov subspace is
$$A_1 = v_0^T A v_0 = \frac{7}{3} \approx 2.333$$
At 1st order, the projection onto this Krylov subspace is
$$A_2 = \begin{pmatrix} 7/3 & \sqrt{2}/3 \\ \sqrt{2}/3 & -1/3 \end{pmatrix}$$
This can be done by hand, but is most easily done using numpy:
import numpy as np

# Rows of vstar are the orthonormal Krylov basis vectors v0 and v1
vstar = np.array([[1/np.sqrt(3), 1/np.sqrt(3), 1/np.sqrt(3)],
                  [-1/np.sqrt(6), np.sqrt(2/3), -1/np.sqrt(6)]])
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
v = vstar.T  # columns are the basis vectors
proj = vstar @ A @ v
print(proj)
eigenvalues, eigenvectors = np.linalg.eig(proj)
print("The eigenvalues are ", eigenvalues)
print("The eigenvectors are ", eigenvectors)
outputs:
[[ 2.33333333 0.47140452]
[ 0.47140452 -0.33333333]]
The eigenvalues are [ 2.41421356 -0.41421356]
The eigenvectors are [[ 0.98559856 -0.16910198]
[ 0.16910198 0.98559856]]
The minimum eigenvalue estimate is $-0.414$ (exactly $1 - \sqrt{2}$).
1.2 Types of Krylov methods
"Krylov subspace methods" can refer to any of several iterative techniques used to solve large linear systems and eigenvalue problems. What they all have in common is that they construct an approximate solution from a Krylov subspace
where is the initial guess (see Ref [5]). They differ in how they choose the best approximation from this subspace, balancing factors such as convergence rate, memory usage, and overall computational cost. The focus of this lesson is to leverage quantum computing in the context of Krylov subspace methods; an exhaustive discussion of these methods is beyond its scope. The brief definitions below are for context only and include some references for investigating these methods further.
The conjugate gradient (CG) method: This method is used for solving symmetric, positive definite linear systems [6]. It minimizes the $A$-norm of the error at each iteration, making it particularly effective for systems arising from discretized elliptic PDEs [7]. We will use this approach in the next section to motivate why a Krylov subspace is an effective subspace in which to search for improved solutions to linear systems.
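To make the idea concrete, here is a textbook conjugate-gradient loop in NumPy (a sketch for illustration, not an optimized implementation; the function name is ours). Each iterate lies in the Krylov space of the initial residual, and each step minimizes the $A$-norm of the error over that space:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=100):
    """Textbook CG for a symmetric positive definite matrix A."""
    x = x0.astype(float)
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)      # step length minimizing the A-norm of the error
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # new direction, A-conjugate to previous ones
        rs = rs_new
    return x

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
b = np.array([1., 2., 3.])
x = conjugate_gradient(A, b, np.zeros(3))
print(np.allclose(A @ x, b))  # True
```

In exact arithmetic CG converges in at most $N$ iterations for an $N \times N$ system, which is why it terminates almost immediately on this tiny example.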
The generalized minimal residual (GMRES) method: This is designed for solving general nonsymmetric linear systems. It minimizes the residual norm over a Krylov space at each iteration, making it robust but potentially memory-intensive for large systems [7].
The minimal residual (MINRES) method: This method is used for solving symmetric indefinite linear systems. It is similar to GMRES but takes advantage of the matrix symmetry to reduce computational cost [8].
Other approaches of note include the full orthogonalization method (FOM), which is closely related to Arnoldi's method for eigenvalue problems, the bi-conjugate gradient (BiCG) method, and the induced dimension reduction (IDR) method.
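Several of these methods ship with SciPy's sparse linear algebra module, which makes them easy to try on a small system (this assumes SciPy is installed; for this small symmetric positive definite matrix all three solvers converge):

```python
import numpy as np
from scipy.sparse.linalg import cg, gmres, minres

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
b = np.array([1., 2., 3.])

for solver in (cg, gmres, minres):
    x, info = solver(A, b)  # info == 0 signals convergence
    print(solver.__name__, np.round(x, 6), info)
```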
1.3 Why the Krylov subspace method works
Here we will motivate, through the lens of steepest descent, the claim that the Krylov subspace method is an efficient way to approximate matrix eigenvalues via iterative refinement of eigenvector approximations. We will argue that, given an initial guess for a ground state, the space of successive corrections to that guess that yields the fastest convergence is a Krylov subspace. We stop short of a rigorous proof of convergence behavior.
Assume our matrix of interest $A$ is symmetric and positive definite. This makes our argument most relevant to the CG method above. We make no assumptions about sparsity here; nor are we claiming that $A$ must be Hermitian (which it would need to be if it were a Hamiltonian).
We typically wish to solve a problem of the form
$$A x = b.$$
One might imagine that $b = \lambda x$, where $\lambda$ is some constant, as in an eigenvalue problem. But our problem statement remains more general for now.
We start with a vector $x_0$ that is an approximate solution. Although there are parallels between this guess and $v_0$ in Section 1.1, we are not leveraging them here. Our guess has an error, which we call
$$e = x - x_0.$$
We also define the residual
$$R = b - A x_0.$$
Here we use capital $R$ to distinguish the residual from the dimension $r$ of our Krylov subspace.

We now want to make a correction step of the form
$$x_1 = x_0 + p,$$
which we hope improves our approximation. Here $p$ is some vector yet to be determined. Let $e_1$ be the error after the correction is made. Then
$$e_1 = x - x_1 = x - x_0 - p = e - p.$$

We are interested in how our error behaves when transformed by our matrix. So let us calculate the $A$-norm of the error. That is,
$$\|e_1\|_A^2 = (e - p)^T A (e - p) = e^T A e - 2\, p^T A e + p^T A p = c - 2\, p^T R + p^T A p,$$
where we have used the symmetry of $A$ and also that $A e = A(x - x_0) = b - A x_0 = R$. Here $c = e^T A e$ is a constant independent of $p$. As mentioned in Section 1.2, the $A$-norm of the error is not the only quantity we might choose to minimize, but it is a good one. We want to see how this quantity varies with our choice of correction vector $p$. So we define the function $f$ by setting
$$f(p) = p^T A p - 2\, p^T R + c.$$
$f(p)$ is just the error as a function of the correction, measured in the $A$-norm. Hence, we want to choose $p$ such that