Hyperon decays present a promising alternative for extracting $\vert V_{us} \vert$ from lattice QCD combined with experimental measurements. Currently $\vert V_{us} \vert$ is determined from the kaon decay widths and a lattice calculation of the associated form factor. In this proceeding, I will present preliminary work on a lattice determination of the hyperon mass spectrum. I will additionally summarize future goals in which we will calculate the hyperon transition matrix elements, which will provide an alternative means for accessing $\vert V_{us} \vert$. This work is based on a particular formulation of SU(2) chiral perturbation theory for hyperons; determining the extent to which this effective field theory converges is instrumental in understanding the limits of its predictive power. This is especially relevant because some hyperonic observables are difficult to calculate near the physical pion mass (e.g., hyperon-to-nucleon form factors), so using heavier-than-physical pion masses, combined with extrapolations to the physical point, is likely to yield more precise results.
]]>Thank my contributors.
As we all know, quarks mix flavor under the weak interaction. One way this difference is manifested is through K, K-bar mixing, in which we see the quarks oscillate between flavors as shown in the box diagram on the right.
In the Standard Model, the difference between the quark eigenstates of the weak and strong interactions is encoded in the CKM matrix. If they were the same, the matrix would be diagonal. But they are not, so we have off-diagonal entries that allow mixing between different generations and flavors.
Notes:
It’s instructive to see how the CKM matrix element can be related to experiment. So, for example, consider leptonic pion decay. Once we’ve written down the transition matrix element, we can rewrite the hadronic matrix element (green) in terms of its form factor. Then by spin-averaging and integrating the transition matrix element over phase space, we can relate $F_\pi$ and $V_{ud}$ to the decay rate. Of course, if you’re more careful than I am, you can work out what “stuff” is.
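For concreteness, the "stuff" can be worked out; at tree level (i.e., up to radiative corrections, and in the $F_\pi \approx 130$ MeV normalization) the leptonic pion decay rate takes the standard form:

```latex
\Gamma(\pi^- \to \ell^- \bar\nu_\ell)
  = \frac{G_F^2}{8\pi}\, \vert V_{ud} \vert^2\, F_\pi^2\,
    m_\pi\, m_\ell^2 \left( 1 - \frac{m_\ell^2}{m_\pi^2} \right)^{\!2}
```

Measuring the rate on the left thus pins down the product $\vert V_{ud} \vert F_\pi$; separating the two requires theory input for the decay constant.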
Notes:
Experimentally, there are a few different setups one could use to determine $V_{ud}$, but the tightest determinations come from superallowed nuclear $\beta$ decay. The overall principle is more or less the same, however, and follows the procedure I outlined on the previous slide. Of course, depending on the decay process, the form factors will be different. For example, in the case of nuclear beta decay, there will be multiple form factors that must be accounted for.
Notice that the determinations from nuclear beta decay are currently limited by theory. Neutron beta decays are a promising alternative, since there is no nuclear structure correction that needs to be accounted for. Of course, this requires resolving the neutron lifetime puzzle.
Finally, I should add that the example I gave on the previous slide is
Add nuclear mirror description
Maybe explain less
Notes:
Determinations of $V_{us}$ are less precise since there’s no strange-quark equivalent of nuclear beta decay (at least, not at a rate that would be useful for calculating this quantity). Instead we determine $V_{us}$ from kaon decays, hyperon decays, or tau decays. Kaon decays require theory input by way of either $F_K/F_\pi$ (this work) or the form factor at zero momentum transfer, $f^+(0)$. Either way, those observables are most precisely determined by lattice QCD.
Traditionally, one uses hyperon decays, which are roughly the $V_{us}$ equivalent of neutron decays. However, hyperon decay estimates assume SU(3) flavor symmetry, which is broken at roughly the 15% level in the baryon masses. Consequently, the error from this source is greater than from kaon decays. But as we will later see, it is possible to instead use the lattice to estimate this transition matrix element.
Finally, one could use hadronic $\tau$ decays to determine $V_{us}$. Compared to kaon decays, this has the advantage of not requiring a calculation of form factors, but there are assumptions in the calculation that are thought to break down. There is roughly a two-sigma discrepancy when we compare $V_{us}$ determinations from kaon vs. tau decays.
Using either leptonic or semileptonic kaon decays provides the most precise determination of $V_{us}$. In this work, we use leptonic kaon decay rates in conjunction with a lattice determination of $F_K/F_\pi$ to determine $V_{us}$.
Notes:
Unlike the pion decay constant, this form factor is not immediately identified with a hadronic transition element from a particle state to the vacuum. Instead we have $\langle P_2 | \overline d \gamma_\mu u | P_1 \rangle = f_+(q^2)\, p_\mu + f_-(q^2)\, q_\mu$, where $p = p_1 + p_2$ and $q = p_1 - p_2$.
So if either $F_K/F_\pi$ or the 0-momentum form factor can be used to determine $V_{us}$, why do we use $F_K/F_\pi$?
In lattice QCD, $F_K/F_\pi$ is what we call a “gold-plated” quantity. Unlike many other QCD observables, it can be easily calculated to high precision on the lattice, for several reasons.
The primary advantage of using $F_K/F_\pi$ is that it’s dimensionless, which makes the calculation slightly easier. In short, the inputs to lattice calculations are dimensionless bare parameters, meaning the output is inherently dimensionless. In order to calculate a dimensionful quantity using lattice QCD, we need to set the scale by comparing a dimensionful quantity (like the mass of a baryon) to its “mass” on the lattice. Besides being tedious, scale setting adds a bit of extra uncertainty to the calculation.
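As a sketch of what scale setting involves (here using the $\Omega$ baryon as the reference quantity, as in our related scale-setting work): the lattice outputs the dimensionless combination $(a\,m_\Omega)$, and comparing to experiment fixes the spacing $a$, which then converts any other lattice number to physical units.

```latex
a = \frac{(a\, m_\Omega)^{\mathrm{latt}}}{m_\Omega^{\mathrm{phys}}} ,
\qquad
M^{\mathrm{phys}} = \frac{(a\, M)^{\mathrm{latt}}}{a}
```

A dimensionless ratio like $F_K/F_\pi$ skips this step entirely, since the factors of $a$ cancel.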
The other advantage to $F_K/F_\pi$ over $f^+(0)$ is that its numerator and denominator are correlated, further improving statistics.
As for why $F_K/F_\pi$ is in general a convenient observable to calculate on the lattice compared to other generic observables, and therefore a good benchmark for comparing lattice actions, there are a couple more advantages.
First, the quantity is mesonic, so it doesn’t have the signal-to-noise issue associated with baryonic quantities. I’ll elaborate more on that later.
Second, as I will explain later, we determined $F_K/F_\pi$ by using our lattice data in conjunction with an effective field theory, in this case that effective field theory being chiral perturbation theory. The chiral expression is known to enough detail that our result is limited by lattice statistics, not theory. I’ll explain more of this later, too.
I’ve spent a bit of time explaining how lattice QCD allows us to calculate observables on the lattice, so it’s worth spending a minute explaining exactly what lattice QCD is.
Lattice QCD is a non-perturbative approach to QCD, which is particularly useful in the low-energy limit where the coupling constant becomes order 1. On the left, we have some picture of what that means. A proton – ostensibly, just three quarks – propagates through space. However, as the proton propagates, it interacts with the quark-gluon sea. In QED, you could draw a similar diagram for a proton propagating through space, but each vertex would be suppressed by a factor of 1/137 squared. In low-temperature QCD, in contrast, the coupling is order 1, so the more complicated diagrams contribute just as much as the simple ones.
Of course, the practical implication here is that you can’t expand the path integral in terms of Feynman diagrams in low-temperature QCD. The lattice QCD approach, therefore, is to estimate that path integral directly. Since the path integral is infinite dimensional, instead of integrating the field values at every point in spacetime, we discretize the fields such that they can only exist at particular locations on a lattice.
Since the resulting integral is generally intractable, observables are then estimated by sampling field configurations.
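As a cartoon of this sampling idea – not lattice QCD itself, but the same Metropolis logic applied to a one-dimensional toy problem – one can sample configurations of a discretized harmonic-oscillator path integral (all parameter values here are illustrative):

```python
import numpy as np

# Toy Metropolis sampling of a 1D Euclidean path integral on a time lattice.
# The "field" x(t) lives only on lattice sites; observables are estimated as
# averages over sampled configurations, just as in lattice QCD.

rng = np.random.default_rng(0)
n_sites, n_sweeps, step = 20, 2000, 0.5

def action(x):
    # Discretized Euclidean action: nearest-neighbor kinetic term + harmonic potential
    return np.sum((np.roll(x, -1) - x) ** 2 / 2 + x ** 2 / 2)

x = np.zeros(n_sites)
samples = []
for sweep in range(n_sweeps):
    for i in range(n_sites):
        x_new = x.copy()
        x_new[i] += rng.uniform(-step, step)
        # Metropolis accept/reject based on the change in the action
        if rng.random() < np.exp(action(x) - action(x_new)):
            x = x_new
    if sweep > 500:  # discard thermalization sweeps
        samples.append(np.mean(x ** 2))

print(f"<x^2> estimate: {np.mean(samples):.3f}")
```

The real path integral is over gluon (and quark) fields on a four-dimensional lattice, but the structure – propose a local change, accept with probability controlled by the action, average observables over configurations – is the same.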
Determining the energy of a particle on the lattice:
Notes:
By fitting the correlator, we can determine the ground state energy. However, simply plotting the correlator is not generally very useful – the exponential decay makes it difficult to check whether the fit “looks” good. Instead, we usually construct the effective mass.
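The construction can be sketched with a synthetic two-state correlator (all energies and overlaps below are made up): the effective mass $m_{\rm eff}(t) = \log[C(t)/C(t+1)]$ plateaus at the ground-state energy once excited-state contamination dies off.

```python
import numpy as np

# Synthetic two-state correlator C(t) = A0*exp(-E0*t) + A1*exp(-E1*t).
E0, E1, A0, A1 = 0.5, 1.2, 1.0, 0.7   # illustrative energies/overlaps in lattice units
t = np.arange(0, 20)
C = A0 * np.exp(-E0 * t) + A1 * np.exp(-E1 * t)

# Effective mass: flat regions signal ground-state dominance.
m_eff = np.log(C[:-1] / C[1:])
print(m_eff[:3])   # early times: contaminated by the excited state
print(m_eff[-3:])  # late times: plateaus near E0
```

On real data the late-time points also carry growing statistical noise, which is why one fits a window of time slices rather than reading off a single point.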
Look at effective mass, wave function overlap to see how well fit works.
Explain variance of correlation function
dir/smr two different operators but with same gs – helps with fit
Snags:
On the right, we have an example of how we determine a good correlator fit.
Once we’ve determined a correlator fit, we want to extract some quantity from the fit’s posterior – in this case, the ground state energy
In order to convert our dimensionless quantities on the lattice into physical quantities, we must introduce a scale. We would like the scale to have certain properties:
Notes:
By definition, lattice calculations cannot be performed at the physical point since, as far as we know, space-time is not discretized. Or at least, even if it were discretized, the spacing would be many orders of magnitude smaller than that of the finest lattice simulations.
So when we run these lattice simulations, we can only obtain answers at the physical point by extrapolating from our lattice data. To guide that extrapolation, we must make an Ansatz. Sometimes it is sufficient to just extrapolate in a Taylor series. But sometimes we can use effective field theory to guide us.
In an effective field theory, we trade the degrees of freedom of the underlying theory for whatever degrees of freedom are accessible at the energies we’re probing. At low temperatures, you don’t see free quarks, so it makes sense to work with different degrees of freedom. In our case, we can treat the pseudoscalar mesons as the relevant degrees of freedom. This particular effective field theory is known as chiral perturbation theory, so named after the spontaneously broken chiral symmetry exhibited in QCD.
Now consider the plot of the effective mass on the left. We could estimate the ground state by calculating the effective mass at a single time slice. However, we would have to balance concerns about precision with concerns about accuracy. At early times, the data is not accurate since excited states contaminate the ground state estimate. At late times, the data is noisy and we cannot precisely determine the ground state.
The compromise, of course, is to fit multiple time slices. Then you can use the more precise data to guide your fit to the limiting value.
Now consider the plot of $F_K/F_\pi$. We can use effective field theory to expand $F_K/F_\pi$ in the pion mass and lattice spacing. Clearly we cannot calculate $F_K/F_\pi$ at zero lattice spacing, so we must extrapolate from finite lattice spacing – the red, green, blue, and cyan bands – to the continuum limit.
Similarly, we generate our lattice data at multiple pion masses. Although we have calculated $F_K/F_\pi$ on the lattice near the physical pion mass, simulations near the physical point are much costlier for a variety of technical reasons. In fact, depending on the calculation, the difference between a simulation at heavy pion mass with coarse lattice spacing and one at physical pion mass with fine lattice spacing can be the difference between days and months of computational time on a high-performance computer, so generating data away from the physical point is relatively cheap.
Furthermore, you need data on multiple lattices in order to determine the low energy constants of your effective field theory.
Notes:
Marciano [mar-see-an-o] has related $F_K/F_\pi$ and $|V_{us}|/|V_{ud}|$ to kaon/pion decay rates, so by combining our $F_K/F_\pi$ result with experimental results for $V_{ud}$ determined via superallowed nuclear beta decays, we can precisely determine $V_{us}$. The term in brackets is a radiative QED correction.
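Schematically, the relation reads as follows (the bracketed term is the radiative QED correction; see Marciano for the precise definitions of $C_K$ and $C_\pi$):

```latex
\frac{\Gamma(K^- \to \mu^- \bar\nu_\mu(\gamma))}{\Gamma(\pi^- \to \mu^- \bar\nu_\mu(\gamma))}
  = \frac{\vert V_{us} \vert^2}{\vert V_{ud} \vert^2}\,
    \frac{F_K^2}{F_\pi^2}\,
    \frac{m_K \left(1 - m_\mu^2/m_K^2\right)^2}{m_\pi \left(1 - m_\mu^2/m_\pi^2\right)^2}
    \left[ 1 + \frac{\alpha}{\pi}\,(C_K - C_\pi) \right]
```

The decay widths and meson masses are measured; $F_K/F_\pi$ comes from the lattice; and $\vert V_{ud} \vert$ comes from superallowed beta decays, leaving $\vert V_{us} \vert$ as the only unknown.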
Here are the definitions of the pseudoscalar decay constants, which we calculate on the lattice. Now we can see that the decay constants are aptly named, since the definition arises from the mesons decaying into the QCD vacuum. Of course, in real life pions and kaons decay weakly into other particles. So by applying the axial current to the pion state, the pion can decay to the QCD vacuum. The functional form on the right-hand side comes from matching the $\mu$ index – the momentum is the only Lorentz 4-vector available – and from noting that the form factor is a function of $p^2$, which, since the meson is on-shell, is a constant. Hence “pion decay constant.”
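For reference, the standard definitions are (phase conventions vary between references; this is the $F_\pi \approx 130$ MeV normalization):

```latex
\langle 0 \vert\, \overline d\, \gamma_\mu \gamma_5\, u \,\vert \pi^+(p) \rangle = i F_\pi\, p_\mu ,
\qquad
\langle 0 \vert\, \overline s\, \gamma_\mu \gamma_5\, u \,\vert K^+(p) \rangle = i F_K\, p_\mu
```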
The goal of this work is to determine $F_K/F_\pi$ at the physical point, that is, at the physical pion and kaon masses and in the continuum, infinite volume limit. To this end, we use chiral perturbation theory to expand $F_K/F_\pi$ in terms of the pseudoscalar meson masses.
At LO we expect $F_K/F_\pi = 1$, as this is the SU(3) flavor limit where kaons and pions are identical. The terms in the top row, therefore, provide corrections to $F_K/F_\pi$ away from this limit. The terms in the bottom row are lattice artifacts that must be accounted for.
When we perform our extrapolation, we don’t limit ourselves to a single model. Instead we consider 24 different models and then take the model average. The 24 different models come from the following choices:
Notes:
Here we see the chiral expression for $F_K/F_\pi$ expanded to NLO. Despite this observable being expanded in terms of three variables, there’s only a single LEC that needs to be determined: $L_5$.
The chiral logs come from loops in a Feynman diagram.
This expansion has been worked out to N2LO too, and we have included the analytic N2LO chiral expansion in our analysis. However, as we include more terms in our expansion, the expansion formula becomes significantly more complicated. For example, although the NLO expression is only two lines long, the analytic form for the N2LO loop corrections spans many pages when written out.
Rather than use the full expansion at N2LO, therefore, one might instead just use a Taylor expansion for the higher order terms. We considered both approaches in our analysis.
Because we’re fitting a chiral expansion, we need to determine the parameters in this expansion. At LO, there are no parameters to be determined since $F_K/F_\pi$ is 1. At NLO, there is only a single chiral LEC, assuming we Taylor-expand the ratio: the Gasser-Leutwyler [gawh-ser loot-why-lehr] constant $L_5$. But at higher orders, there are many more parameters. At N2LO, there are 11 more; and at N3LO, there are 6 more.
We use 18 different ensembles in our lattice calculation, each of which is a datapoint in our fit. So we have essentially 18 parameters to fit with only 18 datapoints. While a frequentist might deem the endeavor hopeless at this point, a Bayesian would not. We can constrain the parameters by assigning them prior distributions. And from the graph, we see the fit is improved even as we add more parameters to our fit: the widest band has only two parameters if we include a lattice spacing correction, but the narrowest band has as many parameters as we have data points.
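The effect of priors can be sketched with a toy fit (the model and numbers below are made up; the real analysis uses lsqfit): treating each prior as an extra "data point" in an augmented least-squares system is what keeps an otherwise underdetermined fit well posed.

```python
import numpy as np

# Toy prior-constrained fit: 3 parameters, only 3 data points.
x = np.array([0.1, 0.2, 0.3])
y = np.array([1.10, 1.21, 1.35])
sig = 0.05 * np.ones_like(y)          # data uncertainties

# Model: y = c0 + c1*x + c2*x^2; priors: c_i = 0 +/- 2 (order-1 coefficients).
prior_mean, prior_sig = np.zeros(3), 2.0 * np.ones(3)

A = np.vander(x, 3, increasing=True)  # design matrix with columns [1, x, x^2]
# Augmented system: data rows weighted by 1/sig, prior rows by 1/prior_sig.
A_aug = np.vstack([A / sig[:, None], np.diag(1 / prior_sig)])
b_aug = np.concatenate([y / sig, prior_mean / prior_sig])
c, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
print(c)  # posterior central values for c0, c1, c2
```

Minimizing the augmented residual is equivalent to minimizing the usual chi-squared plus a Gaussian prior penalty, which is precisely how lsqfit incorporates priors.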
We have a rough idea of what the width of our parameters should be based on the size of our expansion parameters. Regardless, we can check whether our parameters are reasonable by using the empirical Bayes method, which uses the data to determine the most likely priors that would support that data.
To see how empirical Bayes works, consider a model of $F_K/F_\pi$. Usually when we think of models, we think of the analytic expressions describing some phenomenon – in this case, a chiral expression for $F_K/F_\pi$ to some order. But we can extend the idea of a model to also include the set of priors for the LECs, which I’ll denote with a capital $\Pi$. The chiral expression is then $f$.
Our goal is to find the set of priors for the LECs that is most likely given our chiral expression and data. Using Bayes theorem, we see that a prior is more likely to describe our model when either:
The green expression is also known as a hyperprior distribution since it is the prior distribution for the priors. The denominator is just a normalization factor.
In our case, we only use the empirical Bayes method to determine the appropriate widths for our priors. We expect our LECs to be order 1, but by probing multiple orders of magnitude, we can verify this hypothesis. Because we don’t typically know the sign of our LECs a priori, we usually set the central value of our priors to 0.
As an example, suppose we are confident in our priors for our chiral LECs, but we aren’t sure about the priors for the discretization terms. We consider the following candidates: all discretization LECs $0 \pm 0.1$, $0 \pm 1$, or $0 \pm 10$; that is, we explore the appropriate prior width over three orders of magnitude.
Since we have no a priori reason for thinking any of these priors might be better than another, we can set the prior distribution for our priors (the green expression) to be uniform. Then the most likely prior, according to empirical Bayes, will be the prior that maximizes the likelihood function (the blue expression).
The likelihood function can be readily calculated – it just requires us to marginalize over our parameters. In general this is hard, but for strongly peaked distributions we can use Laplace’s method to approximate the integral numerically.
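Schematically, with $k$ parameters $\theta$ and $\theta^*$ the maximum of the integrand, Laplace's method approximates the marginal likelihood as a Gaussian integral about the peak:

```latex
P(D \mid \Pi, f) = \int d^k\theta \; P(D \mid \theta, f)\, P(\theta \mid \Pi)
\approx e^{-S(\theta^*)}\, \frac{(2\pi)^{k/2}}{\sqrt{\det \nabla^2 S(\theta^*)}},
\qquad
S(\theta) \equiv -\log\!\left[ P(D \mid \theta, f)\, P(\theta \mid \Pi) \right]
```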
Finally, there’s one caveat I should mention. The scheme I’ve described assumes that the hyperprior distribution is uniform. But if you start fiddling with the candidate priors too much, that assumption is no longer a good one.
Again, we have 24 different candidate models to describe our data. We give each model a weight in accordance with the model’s Bayes factor, which we then use to average each model’s extrapolation to the physical point. The Bayes factor is calculated by marginalizing over each of the model parameters and therefore allows us to compare models with different parameters. Additionally, it automatically penalizes overcomplicated models.
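The averaging step can be sketched as follows (the numbers are illustrative; in practice the log Bayes factors come from lsqfit's `logGBF`). Each model yields an extrapolated value and error, and the averaged variance includes the spread of the central values, so model-choice uncertainty propagates into the final result.

```python
import numpy as np

# Illustrative per-model extrapolations, errors, and log Bayes factors.
y      = np.array([1.1940, 1.1960, 1.1915])
s      = np.array([0.0020, 0.0025, 0.0030])
logGBF = np.array([152.1, 151.6, 149.8])

# Normalized weights w_i ∝ exp(logGBF_i); subtract the max for stability.
w = np.exp(logGBF - logGBF.max())
w /= w.sum()

mean = np.sum(w * y)
# Law of total variance: within-model error plus spread of central values.
var = np.sum(w * (s**2 + y**2)) - mean**2
print(f"model average: {mean:.4f} +/- {np.sqrt(var):.4f}")
```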
In this slide we see how our different model choices impact the model average.
In the top plot, we see that the data prefers $F_\pi$ for the cutoff. In the bottom right plot, we see that the data heavily prefers using a pure Taylor expansion at N2LO, suggesting we have insufficient data to discern the full, analytic N2LO chiral expansion. In the last plot on the bottom right, we see that Taylor-expanding the ratio at NLO is not preferred.
While not shown here, models with and without the $\alpha_S$ correction have about equal weight.
Next we can break down where our sources of error come from. The plot on the right largely reiterates what I said before, but there are a few additional things we can glean from it. For example, we have a single ensemble generated at $a=0.06$ fm. If we hadn’t generated this ensemble, our uncertainty would’ve slightly increased and our extrapolation would’ve shifted down by roughly half a sigma.
Looking at our error budget, we find that the largest source of error came from statistics, and the second largest source came from discretization, giving us a clear path for improving our result: simply increase the number of configurations and ensembles.
Finally, since the up and down sea quarks are degenerate in our action, we also calculate an SU(2) isospin correction.
Comparing our result with other collaborations, we see that our result is in good agreement. The blue band is our result, and the green band is the FLAG average, which is essentially the lattice equivalent of the PDG.
Again, we emphasize that each of these groups is using a different lattice action. Our goal here isn’t to determine the most precise value of $F_K/F_\pi$ but to check that our action is behaving reasonably, in much the same way that experimentalists calculate the same quantity in different ways to check that their equipment is properly calibrated.
So while our result might not be the most precise, we have accomplished the goal we set out to do, which was to verify that our action yields reasonable results.
Finally, as I mentioned at the start of my presentation, we can use $F_K/F_\pi$ to determine $V_{us}$. Using Marciano’s relation, we get the red band. The blue band is the FLAG average for $V_{us}$, which was determined by a different method using semileptonic form factors and the Ademollo–Gatto [ah-di-mall-o gat-o] theorem.
The green band is the experimental result for $V_{ud}$ as determined by superallowed nuclear beta decays. The intersection of the green band and the red band, therefore, yields our determination of $V_{us}$. There’s a little bit of tension between our result and the FLAG average.
Finally, we calculate the unitarity condition for the CKM matrix mentioned before and find that our result supports it.
In conclusion, we can calculate $V_{us}$ from $F_K/F_\pi$, which allows us to test the unitarity condition of the CKM matrix. Further, $F_K/F_\pi$ is a gold-plated quantity, which we can use to compare lattice actions. We see that our action gives a result congruent with previous determinations of $F_K/F_\pi$.
Next we’d like to consider the hyperons, which provide a second way of extracting $V_{us}$.
First, what are hyperons and why do we care about them?
So why do we still need to study hyperons now?
So why bother with hyperon decays?
Why do we need the lattice?
Notes:
[Explain plot]
Notes:
See: https://arxiv.org/pdf/1910.07342.pdf
Not working in a vacuum
Mass spectrum:
Axial charges:
Vector form factors:
Notes:
Consider the S=2 hyperons
[Explain plot]:
[See slide]
Notes:
Notes:
As I previously stated, there are infinitely many ways of discretizing QCD, which should all converge in the 0 lattice spacing, infinite volume limit. Generically we can write the fermionic part of the lattice action in the following way [motions to equation]. The Dirac matrix, in this picture, is responsible for linking the lattice sites.
Here are some desired qualities of a discretized action. The first two, locality and translational invariance, are generic requirements for any QFT. The latter two are more particular to QCD.
Properties:
Unfortunately, the Nielsen-Ninomiya theorem says it’s impossible to simultaneously satisfy all these conditions in an even-dimensional theory. But as we’ll see, it’s possible to circumvent this restriction by adding a fifth dimension to the theory.
Notes:
You might wonder why a lattice practitioner would be interested in the nucleon mass. After all, the mass of the proton and neutron are some of the most precisely determined quantities in physics. In fact, when written in MeV, the conversion factor from atomic mass units to MeV actually introduces the dominant source of error.
Notes:
Notes:
Notes:
[talk about effective mass plot, stability plot]
Notes:
The plot on the right shows that a flat extrapolation for the discretization terms reasonably describes the data
Note:
[Thank contributors, DoE]
Graphical user interface for lsqfit using dash. See here for the documentation.
Run
pip install [-e] .
Either directly use a fit object to spawn a server
# some_script.py
from lsqfit import nonlinear_fit
from lsqfitgui import run_server
...
fit = nonlinear_fit(data, fcn=fcn, prior=prior)
run_server(fit)
or use the console script entry point pointing to a gvar-pickled fit (and a fit function which is not stored in the pickled file)
# other_script.py
import gvar as gv
from lsqfit import nonlinear_fit

def fcn(x, p):
    y = ...
    return y

...
fit = nonlinear_fit(data, fcn=fcn, prior=prior)
gv.dump(fit, "fit.p")
and run
lsqfitgui [--function other_script.py:fcn] fit.p
Both commands will spawn a local server hosting the lsqfit interface.
It is possible to also set up fit meta information, e.g., allowing different fit models. See also the example directory for more details.
The following script spawns the server which generated the above image
python example/entrypoint.py
Thank my contributors.
First, what are hyperons and why do we care about them?
So why do we still need to study hyperons now?
Why do we need the lattice?
Notes:
So why bother with hyperon decays?
Notes:
[Explain plot]
Notes:
Not working in a vacuum
Mass spectrum:
Axial charges:
Vector form factors:
Notes:
Consider the S=2 hyperons
[Explain plot]:
[See slide]
Notes:
Soon LHCb will have millions of hyperon semileptonic decays available for analysis. We propose to calculate transition form factors which, when combined with measurements of decay widths from LHCb, will be used to determine the Cabibbo–Kobayashi–Maskawa (CKM) matrix element $V_{us}$. Along the way, we will also calculate the hyperon mass spectrum and axial charges as a test of baryon chiral perturbation theory, which will serve as the framework for the form factor calculations.
]]>Python code for our scale setting analysis.
This repository performs the chiral, continuum, and infinite-volume extrapolations of $w_0 m_\Omega$ to set the scale for the MDWF on gradient-flowed HISQ action. The present results accompany the scale setting publication available at arXiv:2011.12166.
The analysis was performed by Nolan Miller (millerb) with the master
branch, and Logan Carpenter (
loganofcarpenter) with cross checks by André Walker-Loud (walkloud) on the andre
branch.
The raw correlation functions can be found here, and the bootstrap results for the ground state masses and values of Fpi are contained in the file data/omega_pi_k_spec.h5.
To generate the extrapolation and interpolation results from the paper, run python scale-setting.py -c [name]. This will automatically create the folder /results/[name]/. A summary of the results is given inside /results/[name]/README.md. Extra options can be viewed by running python scale-setting.py --help, which is given below for convenience.
usage: scale-setting.py [-h] [-c COLLECTION_NAME] [-m MODELS [MODELS ...]] [-ex EXCLUDED_ENSEMBLES [EXCLUDED_ENSEMBLES ...]] [-em {all,order,disc,alphas}] [-df DATA_FILE] [-re] [-mc] [-nf] [-na] [-d]
Perform scale setting
optional arguments:
-h, --help show this help message and exit
-c COLLECTION_NAME, --collection COLLECTION_NAME
fit with priors and models specified in /results/[collection]/{prior.yaml,settings.yaml} and save results
-m MODELS [MODELS ...], --models MODELS [MODELS ...]
fit specified models
-ex EXCLUDED_ENSEMBLES [EXCLUDED_ENSEMBLES ...], --exclude EXCLUDED_ENSEMBLES [EXCLUDED_ENSEMBLES ...]
exclude specified ensembles from fit
-em {all,order,disc,alphas}, --empirical_priors {all,order,disc,alphas}
determine empirical priors for models
-df DATA_FILE, --data_file DATA_FILE
fit with specified h5 file
-re, --reweight use charm reweightings on a06m310L
-mc, --milc use milc's determinations of a/w0
-nf, --no_fit do not fit models
-na, --no_average do not average models
-d, --default use default priors; defaults to using optimized priors if present, otherwise default priors
To fine-tune the results, either re-run the fits using the options above or modify /results/[name]/settings.yaml. Similarly, the fits can be constructed with different priors by editing /results/[name]/priors.yaml and re-running python scale-setting.py -c [name].
In addition to this library, this repo contains Jupyter notebooks. The fit for a single model can be explored in /notebooks/fit_model.ipynb. The model average is provided in /notebooks/average_models.ipynb. Some miscellaneous drudgery (e.g., the paper’s sensitivity figure) is available in /notebooks/bespoke_plots.ipynb.
This work makes extensive use of Peter Lepage’s Python modules gvar and lsqfit, which are used to construct the fits and model average. Further, the settings and priors are primarily tweaked by the accompanying yaml files loaded via PyYAML.
We report on a sub-percent scale determination using the Omega baryon mass and gradient-flow methods. The calculations are performed on 22 ensembles of $N_f=2+1+1$ highly improved, rooted staggered sea-quark configurations generated by the MILC and CalLat Collaborations. The valence quark action used is Möbius Domain-Wall fermions solved on these configurations after a gradient-flow smearing is applied with a flowtime of $t_{\rm gf}=1$ in lattice units. The ensembles span four lattice spacings in the range $0.06 \lesssim a \lesssim 0.15$ fm, six pion masses in the range $130 \lesssim m_\pi \lesssim 400$ MeV and multiple lattice volumes. On each ensemble, the gradient-flow scales $t_0/a^2$ and $w_0/a$ and the omega baryon mass $a m_\Omega$ are computed. The dimensionless product of these quantities is then extrapolated to the continuum and infinite volume limits and interpolated to the physical light, strange and charm quark mass point in the isospin limit, resulting in the determination of $\sqrt{t_0} = 0.1422(14)$ fm and $w_0 = 0.1709(11)$ fm with all sources of statistical and systematic uncertainty accounted for. The dominant uncertainty in this result is the stochastic uncertainty, providing a clear path for a few-per-mille uncertainty, as recently obtained by the Budapest-Marseille-Wuppertal Collaboration.
]]>Today I’m here to talk about my lattice determination of the ratio of the pseudoscalar decay constants $F_K$ and $F_\pi$ using a mixed domain-wall on HISQ action, which was only possible due to the work of other members of CalLat.
As we all know, the quark eigenstates of the weak and strong interactions are different. One way this difference is manifested is through K, K-bar mixing, in which we see the quarks oscillate between flavors.
In the Standard Model, the difference between the quark eigenstates of the weak and strong interactions is encoded in the CKM matrix. If the eigenstates were the same, the matrix would be diagonal. However, they are not, so we have off-diagonal entries that allow mixing between different generations and flavors.
According to the Standard Model, the CKM matrix is unitary. From the top row of the CKM matrix, we get the following relation. The CKM matrix entry $V_{ud}$ can be precisely determined experimentally through superallowed beta decays; however, $V_{us}$ cannot and must instead be determined through lattice methods. The last entry in this relation, $V_{ub}$, is comparably small, so this equation predominantly relates $V_{ud}$ and $V_{us}$.
Marciano [mar-see-an-o] has related $F_K/F_\pi$ and $\vert V_{us} \vert/\vert V_{ud} \vert$ to kaon/pion decay rates, so by combining our $F_K/F_\pi$ result with experimental results for $V_{ud}$ determined via superallowed nuclear beta decays, we can precisely determine $V_{us}$.
Here are the definitions of the pseudoscalar decay constants, which we compute on the lattice.
As previously stated, $V_{us}$ is more easily accessed by the lattice than by experiment – at least, unless you have a few hundred million dollars to spare.
So what is lattice QCD? Lattice QCD is a non-perturbative approach to QCD, which is particularly useful in the low-energy limit where the coupling constant becomes greater than 1. The basic idea behind lattice QCD is to imagine what would happen if quark and gluon fields were discretized to a lattice, rather than permitting them to lie anywhere in spacetime, and then considering the limit where the lattice spacing goes to 0. Perhaps unsurprisingly, there are infinitely many ways of discretizing the QCD action, but they aren’t all equally useful.
In contrast with experimentalists, lattice practitioners have the advantage of being able to tune QCD parameters, allowing us to perform lattice “experiments” in “alternative” universes, thereby probing how QCD observables are impacted by their underlying parameters.
Lattice methods can also be used in conjunction with effective field theory, increasing the precision of our results.
$F_K/F_\pi$ is a “gold-plated” quantity. Unlike many other QCD observables, it can be easily calculated to high precision on the lattice. The quantity is dimensionless, meaning we don’t have to worry about scale setting. The numerator and denominator are correlated, further improving statistics. The quantity is mesonic, so it doesn’t have the signal-to-noise issue associated with baryonic observables. And the full chiral expansion is known to NNLO, so we’re only limited by our statistics, not theory.
As I previously stated, there are infinitely many ways of discretizing QCD. We use a mixed action, which is to say that we discretize the sea and valence quarks differently. Since the sea quarks are generally less important than the valence quarks, we use an action that allows us to cheaply produce many field configurations. This also means we can generate additional pion and kaon data for the same amount of computational resources. Our action, unlike some others, has no $O(a)$ discretization errors.
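The importance of the discretization scheme can be illustrated with a toy example that has nothing to do with QCD itself: approximating a derivative on a grid. A forward difference has $O(a)$ errors, while a central difference has $O(a^2)$ errors, so the latter converges much faster as the spacing $a$ shrinks — analogous to an action with no $O(a)$ artifacts.

```python
import numpy as np

# Toy illustration (not lattice QCD): approximate the derivative of
# f(x) = sin(x) at x = 1 with two discretization schemes and shrink
# the spacing a. The central difference (O(a^2) errors) converges
# much faster than the forward difference (O(a) errors).
f = np.sin
x, exact = 1.0, np.cos(1.0)

for a in (0.1, 0.05, 0.025):
    forward = (f(x + a) - f(x)) / a           # O(a) error
    central = (f(x + a) - f(x - a)) / (2 * a)  # O(a^2) error
    print(f"a={a}: forward err={abs(forward - exact):.2e}, "
          f"central err={abs(central - exact):.2e}")
```

Halving $a$ roughly halves the forward-difference error but quarters the central-difference error.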
The goal of this work is to determine $F_K/F_\pi$ at the physical point, that is, at the physical pion and kaon masses and in the continuum, infinite volume limit. To this end, we use chiral perturbation theory to expand $F_K/F_\pi$ in terms of the pseudoscalar masses.
At LO we expect $F_K/F_\pi = 1$ as this is the SU(3) flavor limit. In the SU(3) flavor limit, kaons and pions are identical. The top row, therefore, offers corrections to $F_K/F_\pi$ via $\chi$PT. The terms in the bottom row are lattice artifacts that must be accounted for.
When we perform our extrapolation, we don’t limit ourselves to a single model. Instead we consider 24 different models and then take the model average. The 24 different models come from the following choices:
Because we’re fitting a chiral expansion, we need to determine the parameters in this expansion. At LO, there are no parameters to be determined since $F_K/F_\pi$ is 1. At NLO, there is only a single chiral LEC, assuming we Taylor-expand the ratio: the Gasser-Leutwyler [gawh-ser loot-why-lehr] constant $L_5$. But at higher orders, there are many more parameters. At N2LO, there are 11 more; and at N3LO, there are 6 more.
We use 18 different ensembles in our lattice calculation, each of which is a datapoint in our fit. So we have essentially 18 parameters to fit with only 18 datapoints. While a frequentist might deem the endeavor hopeless at this point, a Bayesian would not. We can constrain the parameters by assigning them prior distributions. And from the graph, we see the fit is improved even as we add more parameters to our fit: the widest band has only two parameters if we include a lattice spacing correction, but the narrowest band has as many parameters as we have data points.
We have a rough idea of what the width of our parameters should be based on the size of our expansion parameters. Regardless, we can check whether our parameters are reasonable by using the empirical Bayes method, which uses the data to determine the most likely priors that would support that data.
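A minimal sketch of how priors make an underdetermined fit well-posed (illustrative numbers only, not our actual analysis): Gaussian priors enter the least-squares problem as extra "residuals", so even a fit with more parameters than data points has a unique solution.

```python
import numpy as np

# Sketch: fit y = c0 + c1*x + c2*x^2 to only TWO data points by
# augmenting chi^2 with Gaussian priors c_i ~ N(0, 1). The priors act
# as extra rows in the least-squares system, making the otherwise
# underdetermined normal equations well-posed.
x = np.array([0.1, 0.2])
y = np.array([1.02, 1.05])
dy = np.array([0.01, 0.01])

X = np.vander(x, 3, increasing=True)          # design matrix [1, x, x^2]
prior_mean, prior_sig = np.zeros(3), np.ones(3)

# Stack (data residuals) / (prior residuals) into one system A c = b
A = np.vstack([X / dy[:, None], np.diag(1 / prior_sig)])
b = np.concatenate([y / dy, prior_mean / prior_sig])
coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
print("posterior central values:", coeffs)
```

Because the data uncertainties are much tighter than the priors, the posterior reproduces the data while the priors merely select the smallest-magnitude coefficients consistent with it.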
Again, we have 24 different candidate models to describe our data. We give each model a different weight in accordance to the model’s Bayes factor, which we then use to average each model’s extrapolation to the physical point. The Bayes factor is calculated by marginalizing over each of the model parameters and therefore allows us to compare models with different parameters. Additionally, it automatically penalizes overcomplicated models.
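The averaging step can be sketched as follows (hypothetical numbers, three models instead of 24): each model contributes an estimate, an uncertainty, and a log Bayes factor from its fit; the weights are proportional to the exponentiated log Bayes factors, and the total variance combines the within-model variance with the between-model spread.

```python
import numpy as np

# Hedged sketch of Bayes-factor model averaging (illustrative numbers,
# not the real 24-model analysis). Each model yields an estimate with
# an uncertainty and a log Bayes factor (logGBF) from its fit.
estimates = np.array([1.1942, 1.1960, 1.1951])  # hypothetical F_K/F_pi values
sigmas    = np.array([0.0021, 0.0018, 0.0025])  # their uncertainties
logGBF    = np.array([102.3, 103.1, 101.7])     # hypothetical log Bayes factors

w = np.exp(logGBF - logGBF.max())  # subtract max for numerical stability
w /= w.sum()                       # normalized model weights

mean = np.sum(w * estimates)
# Total variance = weighted within-model variance + between-model spread
var = np.sum(w * sigmas**2) + np.sum(w * estimates**2) - mean**2
print(f"model average: {mean:.4f} +/- {np.sqrt(var):.4f}")
```

Note how a model whose logGBF is even one unit lower is suppressed by a factor of $e$, which is how overcomplicated models are automatically down-weighted.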
In this slide we see how our different model choices impact the model average.
In the top plot, we see that the data prefers $F_\pi$ for the cutoff. In the bottom right plot, we see that the data heavily prefers using a pure Taylor expansion at N2LO, suggesting we have insufficient data to discern the N2LO chiral logs. In the last plot on the bottom right, we see that Taylor-expanding the ratio at NLO is not preferred.
While not shown here, models with and without the $\alpha_S$ correction have about equal weight.
Next we can break down our sources of error. The plot on the right largely reiterates what I said before, but there are a few additional things we can glean from it. For example, we have a single ensemble generated at $a=0.06$ fm. If we hadn’t generated this ensemble, our uncertainty would’ve slightly increased and our extrapolation would’ve shifted down by roughly half a sigma.
Looking at our error budget, we find that the largest source of error is statistical, and the second largest comes from discretization, giving us a clear path for improving our result: simply increase the number of configurations and ensembles.
Finally, because the up and down sea quarks are degenerate in our action, we also calculate an SU(2) isospin correction.
Comparing our result with other collaborations, we see that our result is in good agreement. The blue band is our result, and the green band is the FLAG average, which is essentially the lattice equivalent of the PDG.
Again, we emphasize that each of these groups is using a different lattice action. Our goal here isn’t to determine the most precise value of $F_K/F_\pi$ but to check that our action is behaving reasonably, in much the same way that experimentalists calculate the same quantity in different ways to check that their methods are valid.
So while our result might not be the most precise, we have accomplished the goal we set out to do, which was to verify that our action yields reasonable results.
Finally, as I mentioned at the start of my presentation, we can use $F_K/F_\pi$ to determine $V_{us}$. Using Marciano’s relation, we get the red band. The blue band is the FLAG average for $V_{us}$, which was determined by a different method using semileptonic form factors and the Ademollo–Gatto [ah-di-mall-o gat-o] theorem.
The green band is the experimental result for $V_{ud}$ as determined by superallowed nuclear beta decays. The intersection of the green band and the red band, therefore, yields our determination of $V_{us}$. There’s a little bit of tension between our result and the FLAG average.
Finally, we calculate the unitarity condition for the CKM matrix mentioned before and find that our result supports it.
In conclusion, we can calculate $V_{us}$ from $F_K/F_\pi$, which allows us to test the unitarity condition of the CKM matrix. Further, $F_K/F_\pi$ is a gold-plated quantity, which we can use to compare lattice actions. We see that our action gives a result congruent with previous determinations of $F_K/F_\pi$. Finally, we see that model averaging is a method that allows us to evaluate the fitness of many models without biasing our result by committing to a single one.
I’d like to once again thank my collaborators in CalLat. Thanks for listening!
A Python notebook for plotting points and lines, expressly written for making spacetime diagrams. To get started with a tutorial, launch the binder instance of the notebook.
The purpose of this repo is to abstract away the details of creating a plot in `matplotlib`, a library that is notoriously tricky for first-time users (especially those with little or no programming experience). For instance, per the documentation, here is about the simplest plot one can make using `matplotlib`.
```python
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)

ax.set(xlabel='time (s)', ylabel='voltage (mV)',
       title='About as simple as it gets, folks')
ax.grid()

plt.show()
```
*A simple matplotlib plot*
This “simple example”, however, is already too complicated for our purposes – we’re only interested in plotting lines and points. To that end, let’s plot a couple of lines. Recall that a line can be specified by either (1) a pair of points or (2) a point and a slope. Suppose we’re interested in plotting two lines with slope $-0.5$, one passing through $A = (1, 2)$ and the other passing through $B = (1, -1)$. Using `matplotlib`, we cannot directly plot this description – instead, we must convert the description of a line into a collection of points, which can then be plotted via `matplotlib.pyplot.plot`.
Using either the code in this repo or `matplotlib`, we should get the following plot.
*Two lines, a grid, and a legend*
Here is the `matplotlib` implementation.
```python
import matplotlib.pyplot as plt
import numpy as np

# Determine range
xlim = (-5, 5)

# Line a: slope and a point it passes through
m_a = -0.5
A = (1, 2)
label_a = 'a'

# Line b
m_b = -0.5
B = (1, -1)
label_b = 'b'

# Convert each (point, slope) description into arrays of x, y values
for label, m, P in zip([label_a, label_b], [m_a, m_b], [A, B]):
    b = P[1] - m * P[0]
    x = np.linspace(xlim[0], xlim[1])
    y = m * x + b
    # Actually plot the line
    plt.plot(x, y, label=label, lw=2)

# Format plot to look nice
plt.xlim(xlim)
plt.ylim(xlim)
plt.gca().set_aspect('equal')
plt.minorticks_on()
plt.tick_params(direction='in', which='both')
plt.grid(which='major', alpha=0.7)
plt.grid(which='minor', alpha=0.2)
plt.legend(bbox_to_anchor=(1, 1), loc='upper left', prop={'size': 16})
plt.axhline(0, ls='--')
plt.axvline(0, ls='--')
plt.show()
```

(Note: a trailing bare `plt.grid()` call has been removed here – called with no arguments after the styled `plt.grid(...)` calls, it would toggle the major gridlines back off.)
Compare with the much simpler implementation in this repo.
```python
from plotter import plot_diagram

# Change these
xlim = (-5, 5)
lines = [
    ('a', (1, 2), -0.5),   # line a
    ('b', (1, -1), -0.5),  # line b
]

# Make plot
plot_diagram(lines, xlim)
```
Of course, there are additional wrinkles to a `matplotlib` approach that must be accounted for (plotting vertical lines, specifying a line by two points instead of a slope, plotting a single point). The `matplotlib` wrapper `plot_diagram` takes care of those issues for us.