$V_{us}$ from the lattice [talk]

Los Alamos National Lab [invited talk] [PDF of slides]

Title slide

Title

Thank my contributors.

Flavor conservation?

As we all know, quarks mix flavor under the weak interaction. One way this mixing manifests is through $K$–$\bar{K}$ mixing, in which we see the quarks oscillate between flavors as shown in the box diagram on the right.

In the Standard Model, the difference between the quark eigenstates of the weak and strong interactions is encoded in the CKM matrix. If they had been the same, the matrix would’ve been diagonal. But they are not, so we have off-diagonal entries that allow mixing between different generations and flavors.

Top-row CKM unitarity

  • According to the Standard Model, the CKM matrix should be unitary
  • From this condition, we can derive many relations between the CKM matrix elements
  • We will concentrate on one particular relation among the top-row elements
  • [explain different top-row elements]
  • [explain graph on left] – Wolfenstein parameterization
  • Light green hyperbola comes from $V_{us}$
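
The particular relation in question is the top-row unitarity condition:

$$ |V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = 1 $$

where $|V_{ub}|^2 \sim 10^{-5}$ contributes negligibly, so this is effectively a relation between $V_{ud}$ and $V_{us}$.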

Notes:

  • Hyperbola source: https://arxiv.org/abs/hep-lat/0509046

Example: $V_{ud}$ from pion decay ($\pi \rightarrow l \, \overline{\nu}_l$)

It’s instructive to see how the CKM matrix element can be related to experiment. So, for example, consider leptonic pion decay. Once we’ve written down the transition matrix element, we can rewrite the hadronic matrix element (green) in terms of its form factor. Then by spin-averaging and integrating the transition matrix element over phase space, we can relate $F_\pi$ and $V_{ud}$ to the decay rate. Of course, if you’re more careful than I am, you can work out what “stuff” is.
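
Schematically, the steps above give (in the $F_\pi \approx 130$ MeV normalization, and neglecting radiative corrections):

$$ \langle 0 | \bar{u} \gamma^\mu \gamma^5 d | \pi^-(p) \rangle = i F_\pi p^\mu, \qquad \Gamma(\pi \to l \, \overline{\nu}_l) = \frac{G_F^2}{8\pi} F_\pi^2 |V_{ud}|^2 \, m_\pi m_l^2 \left(1 - \frac{m_l^2}{m_\pi^2}\right)^2 $$

The $m_l^2$ factor is the helicity suppression that makes the muon channel dominate.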

Notes:

  • $G_F$: weak-interaction coupling constant. Can be related to QCD/the neutron decay constant by the Goldberger–Treiman relation, for example.
  • Relate spin-averaged, phase-space integrated transition matrix element to decay rate
  • $d \Phi$: phase space
  • Also need to include radiative QED corrections
  • Decay into electrons is helicity-suppressed, making decays into muons more favorable
  • $V_{km}$ is the product of two unitary matrices which transform the quark fields between the mass and weak bases
  • $\gamma^\mu (1 - \gamma^5)$ projects out the left-handed (helicity) fields

Experimental determination of $|V_{ud}|$

Experimentally, there are a few different setups one could use to determine $V_{ud}$, but the tightest determinations come from superallowed nuclear $\beta$ decay. The overall principle is more or less the same, however, and follows the procedure I outlined on the previous slide. Of course, depending on the decay process, the form factors will be different. For example, in the case of nuclear beta decay, there will be multiple form factors that must be accounted for.

Notice that the determinations from nuclear beta decay are currently limited by theory. Neutron beta decays are a promising alternative, since there is no nuclear structure correction that needs to be accounted for. Of course, this requires resolving the neutron lifetime puzzle.

Finally, I should add that the example I gave on the previous slide is

Add nuclear mirror description

Maybe explain less

Notes:

  • See this paper for more details: [Towner and Hardy, 2010]
  • Experiments: use comparative half-life instead of half-life (only depends on nuclear matrix element)
  • Nuclear: a neutron decays inside a nucleus
    • Superallowed: wave function nearly identical after decay (total spin, parity unchanged), low comparative half-life, decays very quickly, isobaric analogue states (same atomic weight). In terms of selection rules, it’s a highly favorable decay
    • Most precise in nuclear decays: radiative QED uncertainty dominates
  • Neutron: neutron to proton
    • Two entries: beam vs bottle experiments give statistically different results. Left is range, right is average. (Has this been resolved?)
    • Neutron multiple form factors: vector and axial currents must be accounted for
  • Nuclear mirrors: mass number (number of protons + neutrons) conserved
  • pion: rare semileptonic decay, different from previous example; most of the uncertainty comes from the branching ratio
    • leptonic pion decay: uncertainty from form factor leads to leptonic determinations having about 3 times the uncertainty of their semileptonic counterparts and forty times wider when compared to superallowed nuclear beta decay. [PDG 2019]
  • Nuclear correction: from nuclear structure, isospin-breaking

Experimental determination of $|V_{us}|$?

Determinations of $V_{us}$ are less precise since there’s no strange-quark equivalent of superallowed nuclear beta decay (at least, not in a form that would be useful for calculating this quantity). Instead we determine $V_{us}$ from kaon decays, hyperon decays, or tau decays. Kaon decays require theory input by way of either $F_K/F_\pi$ (this work) or the form factor at zero momentum transfer ($f^+(0)$). Either way, those observables are most precisely determined by lattice QCD.

Historically, one used hyperon decays, which are roughly the $V_{us}$ equivalent of neutron decays. However, hyperon decay estimates assume SU(3) flavor symmetry, which is broken at roughly the 15% level in the baryon masses. Consequently, the error is greater for this source than for kaon decays. But as we will see later, it is possible to instead use the lattice to estimate this transition matrix element.

Finally, one could use hadronic $\tau$ decays to determine $V_{us}$. Compared to kaon decays, this has the advantage of not requiring a calculation of form factors, but there are assumptions in the calculation that are thought to break down. There is roughly a two-sigma discrepancy when we compare $V_{us}$ determinations from kaon vs tau decays.

Using either leptonic or semileptonic kaon decays provides the most precise determination of $V_{us}$. In this work, we use leptonic kaon decay rates in conjunction with a lattice determination of $F_K/F_\pi$ to determine $V_{us}$.

Notes:

  • $\tau$ theory problems: Finite Energy Sum Rule assumptions break down, leading to a $2\sigma$ discrepancy.
  • Leptonic vs semi-leptonic are about equally precise, though they disagree by about $1\sigma$
  • $f^+(0)$:
    • To understand what this term is, consider the leptonic pion decay example from before. In that case, there was only a single form factor that needed to be determined ($F_\pi$). In the case of semileptonic kaon/pion decays, there are two form factors to be determined due to the extra particle, which can be reduced to a single form factor through some assumptions
    • Assumption (conserved vector current hypothesis): the charge-changing weak vector current equals the charge-conserving electromagnetic vector current up to a rotation in isospin space
    • Unlike the pion decay constant, this form factor is not immediately identified with a hadronic transition element from a particle state to the vacuum. Instead we have $\langle P_2 | \bar{d} \gamma_\mu u | P_1 \rangle = f_+(q^2) \, p_\mu + f_-(q^2) \, q_\mu$
  • tau exclusive: tau -> {pion, kaon} + neutrino
  • tau inclusive: tau -> sum of all possible hadronic states + neutrino
  • tau theory problems: relies on finite energy sum rule assumptions; error is probably underestimated
  • $V_{us}$ from tau can also be estimated using exclusive tau decays + $F_K/F_\pi$
  • Historically $V_{us}$ was determined from hyperon decays but assuming SU(3) flavor symmetry (broken by ~15%). This SU(3) flavor assumption allowed one to relate the form factors of the electric charge and magnetic moment of the baryons.

Why $F_K/F_\pi$ via lattice QCD?

So if either $F_K/F_\pi$ or the 0-momentum form factor can be used to determine $V_{us}$, why do we use $F_K/F_\pi$?

In lattice QCD, $F_K/F_\pi$ is what we call a “gold-plated” quantity. Unlike many other QCD observables, it can be readily calculated to high precision on the lattice, for several reasons.

The primary advantage of using $F_K/F_\pi$ is that it’s dimensionless, which makes the calculation slightly easier. In short, the inputs to lattice calculations are dimensionless bare parameters, meaning the output is inherently dimensionless. In order to calculate a dimensionful quantity using lattice QCD, we need to set the scale by comparing a dimensionful quantity (like the mass of a baryon) to its “mass” on the lattice. Besides being tedious, scale setting adds a bit of extra uncertainty to the calculation.

The other advantage to $F_K/F_\pi$ over $f^+(0)$ is that its numerator and denominator are correlated, further improving statistics.

As for why $F_K/F_\pi$ is in general a convenient observable to calculate on the lattice compared to other generic observables, and therefore a good benchmark for comparing lattice actions, there are a couple more advantages.

First, the quantity is mesonic, so it doesn’t have the signal-to-noise issue associated with baryonic quantities. I’ll elaborate more on that later.

Second, as I will explain later, we determined $F_K/F_\pi$ by using our lattice data in conjunction with an effective field theory, in this case that effective field theory being chiral perturbation theory. The chiral expression is known to enough detail that our result is limited by lattice statistics, not theory. I’ll explain more of this later, too.

Brief overview of lattice QCD

I’ve spent a bit of time explaining how lattice QCD allows us to calculate observables, so it’s worth spending a minute explaining exactly what lattice QCD is.

Lattice QCD is a non-perturbative approach to QCD, which is particularly useful in the low-energy limit where the coupling constant becomes order 1. On the left, we have some picture of what that means. A proton – ostensibly, just three quarks – propagates through space. However, as the proton propagates, it interacts with the quark-gluon sea. In QED, you could draw a similar diagram for a proton propagating through space, but each additional photon exchange would be suppressed by a factor of roughly 1/137. In low-temperature QCD, in contrast, the coupling is order 1, so the more complicated diagrams contribute just as much as the simple ones.

Of course, the practical implication here is that you can’t expand a path integral in terms of Feynman diagrams in low-temperature QCD. The lattice QCD approach, therefore, is to attempt to estimate that path integral directly. Since the path integral is infinite dimensional, that means instead of integrating the field values at every point in space time, we instead discretize the fields such that they can only exist at particular locations on a lattice.

Since the resulting integral is generally intractable, observables are then estimated by sampling field configurations.
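
As a toy illustration of “sampling field configurations,” here is a minimal Metropolis sketch for the Euclidean path integral of a harmonic oscillator (a 1D “lattice”; all parameters illustrative, and this is of course nothing like a production lattice QCD code):

```python
import numpy as np

# Euclidean path integral for a harmonic oscillator (m = omega = 1),
# discretized on N time slices with spacing a, sampled via Metropolis.
rng = np.random.default_rng(0)
N, a = 20, 0.5
phi = np.zeros(N)

def action_diff(phi, i, new):
    """Change in the action when site i is updated to `new`."""
    left, right = phi[(i - 1) % N], phi[(i + 1) % N]
    S = lambda x: ((x - left) ** 2 + (right - x) ** 2) / (2 * a) + a * x**2 / 2
    return S(new) - S(phi[i])

samples = []
for sweep in range(4000):
    for i in range(N):
        new = phi[i] + rng.uniform(-1, 1)
        if rng.random() < np.exp(-action_diff(phi, i, new)):
            phi[i] = new          # accept the proposed update
    if sweep > 500:               # discard thermalization sweeps
        samples.append(np.mean(phi**2))

print(np.mean(samples))  # roughly 0.5, the continuum <x^2> for this oscillator
```

The same accept/reject logic, with the QCD action and gauge links in place of $\phi$, underlies how gauge configurations are generated.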

Measuring observables on the lattice

Determining the energy of a particle on the lattice:

  • Let $O^\dagger$ be the operator that creates a particle with the desired quantum numbers (eg, a Delta baryon)
  • By inserting a complete set of states, we can rewrite the correlator in terms of its energy eigenstates
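
Inserting the complete set of states gives the spectral decomposition (up to state normalization):

$$ C(t) = \langle O(t) \, O^\dagger(0) \rangle = \sum_n |\langle n | O^\dagger | \Omega \rangle|^2 \, e^{-E_n t} \;\xrightarrow{\;t \to \infty\;}\; A_0 \, e^{-E_0 t} $$

so at large times the correlator decays with the ground-state energy.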

Notes:

  • Values on the lattice have no dimension
  • Need scale setting to convert to dimensionful units
  • Lattices are not at the physical point (eg, pion mass could be 350 MeV)

Measuring observables on the lattice (1/2)

By fitting the correlator, we can determine the ground state energy. However, simply plotting the correlator is not generally very useful – the exponential decay makes it difficult to check whether the fit “looks” good. Instead, we usually construct the effective mass.
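
The effective mass construction can be sketched in a few lines (the correlator here is fake, with made-up energies in lattice units):

```python
import numpy as np

# Toy two-state correlator with ground-state energy E0 = 0.5 and one
# excited state at E1 = 1.2 (illustrative numbers, lattice units).
t = np.arange(20)
C = 1.0 * np.exp(-0.5 * t) + 0.4 * np.exp(-1.2 * t)

# m_eff(t) = log[C(t)/C(t+1)] plateaus at the ground-state energy once
# the excited-state contamination has decayed away.
m_eff = np.log(C[:-1] / C[1:])
print(m_eff[0], m_eff[-1])  # contaminated at early t, ~0.5 at late t
```

On real data the late-time values would also grow noisy, which is exactly the data-selection tension described below.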

Look at effective mass, wave function overlap to see how well fit works.

Explain variance of correlation function

Direct/smeared: two different operators with the same ground state – helps constrain the fit

Snags:

  • Excited state contamination at early times
  • Noise grows at later times
  • Data selection problem: need to determine range of data for a given fit to $N$ states

Measuring observables on the lattice (2/2)

On the right, we have an example of how we determine a good correlator fit.

  • As I mentioned before, we have a data selection problem – fitting all the data will result in a poor fit, since the late times are essentially pure noise and the early times are contaminated by excited states
  • However, since the late times are so noisy, determining a good fit mostly depends on determining the number of excited states and starting time
  • There are a few techniques we can use:
    • Look for a plateau
    • Check Q values
    • Check Bayes factors (bottom axis) – these are Bayesian fits, so we can directly compare 2 state and 3 state fits, for example

Once we’ve determined a correlator fit, we want to extract some quantity from the fit’s posterior – in this case, the ground state energy

  • However, all values on the lattice are dimensionless, so any quantity you extract is really going to be the product of observables that have mass dimension 0 – in this example, the dimensionless quantity is $aM$
  • But if we know the lattice spacing, we can extract the mass in physical units
  • There’s only one problem with this approach – we generally don’t know the lattice spacing a priori! We must therefore first solve the inverse problem of determining the lattice spacing before we can determine the mass in physical units
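
Once the lattice spacing is known, the conversion itself is a one-liner (the numbers here are purely illustrative, not from our fits):

```python
# Converting a dimensionless lattice fit result aM into MeV.
# Illustrative inputs: suppose a fit gives aM = 0.84 on an ensemble
# whose lattice spacing has been determined to be a = 0.12 fm.
HBARC = 197.327  # hbar * c in MeV * fm

aM, a_fm = 0.84, 0.12
M_MeV = aM / a_fm * HBARC
print(M_MeV)  # ~1381 MeV
```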

Scale setting with $w_0$ \& $M_\Omega$ (1/2)

In order to convert our dimensionless quantities on the lattice into physical quantities, we must introduce a scale. We would like the scale to have certain properties:

  • [see list] In our work, we use the gradient-flow-derived scale $w_0$. This observable is related to the energy density of the gauge fields. However, what it actually is matters less than what it does for us – no experimentalist needs to measure this observable, since we can determine it precisely on the lattice. As you can see from the plot on the right, $w_0$, which is defined through a derivative of the plotted variable, sits where the curve is incredibly flat, so it can be precisely measured on the lattice.

Notes:

  • $w_0$ is defined in terms of a flow equation of the gauge field. Here $w_{0,\text{orig}}$ is the tree-level relation

Scale setting with $w_0$ \& $M_\Omega$ (2/2)

  • Show how we determine $w_0^*$
  • Show interpolation $a/w_0$ per lattice spacing at phys pion mass
  • Show example calculation of a mass on some ensemble

Lattice QCD & effective field theory

By definition, lattice calculations cannot be performed at the physical point since, as far as we know, space-time is not discretized. Or at least, even if it were discretized, the spacing would be many orders of magnitude smaller than that of the finest lattice simulations.

So when we run these lattice simulations, we can only obtain answers at the physical point by extrapolating from our lattice data. To guide that extrapolation, we must make an Ansatz. Sometimes it is sufficient to just extrapolate in a Taylor series. But sometimes we can use effective field theory to guide us.

In an effective field theory, we trade the degrees of freedom of the underlying theory for whatever degrees of freedom are accessible at the energies we’re probing. At low temperatures, you don’t see free quarks, so it makes sense to work with different degrees of freedom. In our case, we can treat the pseudoscalar mesons as the relevant degrees of freedom. This particular effective field theory is known as chiral perturbation theory, so named after the spontaneously broken chiral symmetry exhibited in QCD.

Now consider the plot of the effective mass on the left. We could estimate the ground state by calculating the effective mass at a single time slice. However, we would have to balance concerns about precision with concerns about accuracy. At early times, the data is not accurate since excited states contaminate the ground state estimate. At late times, the data is noisy and we cannot precisely determine the ground state.

The compromise, of course, is to fit multiple time slices. Then you can use the more precise data to guide your fit to the limiting value.

Now consider the plot of $M_N/\Lambda_\chi$, which, to avoid getting distracted by technical details, is basically a dimensionless version of $M_N$. We can use effective field theory to expand $F_K/F_\pi$ in the pion mass and lattice spacing. Clearly we cannot calculate $F_K/F_\pi$ at zero lattice spacing, so we must extrapolate from finite lattice spacing – the red, green, blue, and cyan bands – to the continuum limit.

Similarly, we generate our lattice data at multiple pion masses. Although we have calculated $F_K/F_\pi$ on the lattice near the physical pion mass, simulations near the physical point are much costlier for a variety of technical reasons. In fact, depending on the calculation, the difference between a simulation at heavy pion mass with coarse lattice spacing and one at physical pion mass with fine lattice spacing can be the difference between days and months of computational time on a high-performance computer, so generating data away from the physical point is relatively cheap.

Furthermore, you need data on multiple lattices in order to determine the low energy constants of your effective field theory.

Notes:

  • Noise depends on $M_B - \frac{3}{2} m_\pi$: this gap shrinks as the pion mass increases, so heavier pions are less noisy
  • Dirac operator ill-conditioned (condition number): relevant when inverting the Dirac operator, since the lattice spacing determines the highest eigenvalue and the pion mass determines the lowest eigenvalue
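
The baryon noise behavior is the Parisi–Lepage estimate: for $N_{\text{cfg}}$ configurations,

$$ \frac{S}{N}(t) \sim \sqrt{N_{\text{cfg}}}\; e^{-\left(M_B - \frac{3}{2} m_\pi\right) t} $$

so the signal-to-noise ratio of a baryon correlator degrades exponentially at large times, with a rate set by that mass gap.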

$V_{us}$ from $F_K/F_\pi$

Marciano [mar-see-an-o] has related $F_K/F_\pi$ and $|V_{us}|/|V_{ud}|$ to kaon/pion decay rates, so by combining our $F_K/F_\pi$ result with experimental results for $V_{ud}$ determined via superallowed nuclear beta decays, we can precisely determine $V_{us}$. The term in brackets is a radiative QED correction.
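
Schematically, the relation has the form (hedging the details of the correction terms):

$$ \frac{\Gamma(K \to \mu \overline{\nu}_\mu(\gamma))}{\Gamma(\pi \to \mu \overline{\nu}_\mu(\gamma))} = \frac{|V_{us}|^2}{|V_{ud}|^2} \frac{F_K^2}{F_\pi^2} \frac{m_K \left(1 - m_\mu^2/m_K^2\right)^2}{m_\pi \left(1 - m_\mu^2/m_\pi^2\right)^2} \left[ 1 + \frac{\alpha}{\pi} \left( C_K - C_\pi \right) \right] $$

where the bracketed term is the radiative QED correction.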

Here are the definitions of the pseudoscalar decay constants, which we compute on the lattice. Now we can see that the decay constants are aptly named, since the definition arises from the mesons decaying into the QCD vacuum. Of course, in real life pions and kaons decay weakly into other particles. So by applying the axial current to the pion state, the pion can decay to the QCD vacuum. The functional form on the right-hand side comes from matching the $\mu$ index – the momentum is the only Lorentz 4-vector available – and from noting that the form factor is a function of $p^2$, which, since the meson is on-shell, is a constant. Hence the pion decay constant.

$F_K/F_\pi$ models

The goal of this work is to determine $F_K/F_\pi$ at the physical point, that is, at the physical pion and kaon masses and in the continuum, infinite volume limit. To this end, we use chiral perturbation theory to expand $F_K/F_\pi$ in terms of the pseudoscalar meson masses.

At LO we expect $F_K/F_\pi = 1$, as this is the SU(3) flavor limit where kaons and pions are identical. The terms in the top row, therefore, are corrections to $F_K/F_\pi$ from the breaking of that symmetry. The terms in the bottom row are lattice artifacts that must be accounted for.

When we perform our extrapolation, we don’t limit ourselves to a single model. Instead we consider 24 different models and then take the model average. The 24 different models come from the following choices:

  1. At NLO, whether we (a) use the NLO expressions for $F_K$ and $F_\pi$ in the numerator and denominator or (b) take the Taylor expansion of $F_K$ and $F_\pi$. It sounds pedantic, but the latter choice removes one LEC at NLO.
  2. At N2LO, whether we use the full $\chi$PT expression, which includes chiral logs, or just use a Taylor expansion. Regardless, the N3LO correction is just a Taylor-series correction.
  3. What we use for our renormalization/chiral cutoff. Expansions written in terms of $F_K$ or $F_\pi$ agree up to the order we work at in $\chi$PT, so the difference between choices gives us an idea of the error from truncating at that order.
  4. Whether or not we include the $\alpha_S$ term, which is a lattice correction from radiative gluons and is a quirk particular to some action discretizations.
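
The counting can be made concrete by enumerating the choices. The two-way choices follow the list above; treating the chiral cutoff as a three-way choice is an assumption made here purely so the arithmetic works out to 24:

```python
from itertools import product

# Hypothetical enumeration of the model space sketched above.
nlo_choice    = ["NLO expressions", "Taylor-expanded ratio"]  # choice 1
n2lo_choice   = ["full chiPT", "Taylor expansion"]            # choice 2
cutoff_choice = ["F_pi", "F_K", "mixed"]                      # choice 3 (assumed 3 options)
alphas_choice = [True, False]                                 # choice 4

models = list(product(nlo_choice, n2lo_choice, cutoff_choice, alphas_choice))
print(len(models))  # 2 * 2 * 3 * 2 = 24
```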

Notes:

  • Why $\Lambda_\chi$? Why can we make those choices?
  • Cutoff sets scale of effective field theory – degrees of freedom change when there’s enough energy to create new particles not described by our theory (the rho => $m_\rho \approx 770$ MeV)
  • Why can we use $F$ instead of the rho mass? The resulting scale is approximately the mass of the rho, and $F$ is more convenient (appears often in $\chi$PT)

Example: NLO model

Here we see the chiral expression for $F_K/F_\pi$ expanded to NLO. Despite this observable being expanded in terms of three variables, there’s only a single LEC that needs to be determined: $L_5$.

The chiral logs come from loops in a Feynman diagram.

This expansion has been worked out to N2LO too, and we have included the analytic expansion N2LO chiral expansion in our analysis. However, as we include more terms in our expansion, the expansion formula becomes significantly more complicated. For example, although the NLO expression is only two lines long, the analytic form for the loop corrections would span many pages when written out.

Rather than use the full expansion at N2LO, therefore, one might instead just use a Taylor expansion for the higher-order terms. We considered both approaches in our analysis.

  • Smoothly goes from NLO to N2LO – doesn’t look like overfitting
  • No evidence for large chiral logs

Lattice details

  • For our particular lattice calculations, we employ lattices with pion masses ranging from the physical point value (about 130 MeV) to 400 MeV
  • Our lattice spacings range from 0.06 fm (purple) to 0.15 fm (red)
  • For those more familiar with lattice parlance, we use a mixed action, with domain wall fermions in the valence sector and highly-improved staggered quarks in the sea sector
  • Our gauge configurations are provided by MILC (thanks!)

Example fit

  • [Explain fit]

Model Parameters

Because we’re fitting a chiral expansion, we need to determine the parameters in this expansion. At LO, there are no parameters to be determined since $F_K/F_\pi$ is 1. At NLO, there is only a single chiral LEC, assuming we Taylor-expand the ratio: the Gasser-Leutwyler [gawh-ser loot-why-lehr] constant $L_5$. But at higher orders, there are many more parameters. At N2LO, there are 11 more; and at N3LO, there are 6 more.

We use 18 different ensembles in our lattice calculation, each of which is a datapoint in our fit. So we have essentially 18 parameters to fit with only 18 datapoints. While a frequentist might deem the endeavor hopeless at this point, a Bayesian would not. We can constrain the parameters by assigning them prior distributions. And from the graph, we see the fit improves even as we add more parameters: the widest band has only two parameters (if we include a lattice spacing correction), but the narrowest band has as many parameters as we have data points.
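
The prior-constrained fit idea can be sketched with a toy linear model: Gaussian priors enter the least-squares problem as extra “data” rows, so the fit stays well posed even with as many parameters as data points. All numbers below are illustrative, not our lattice analysis:

```python
import numpy as np

# Toy data for a model y = c0 + c1 * x with uniform uncertainty sigma.
x = np.array([0.1, 0.2, 0.3])
y = np.array([1.05, 1.12, 1.18])
sigma = 0.02

# Data rows of the weighted design matrix.
A = np.vstack([np.ones_like(x), x]).T / sigma
b = y / sigma

# Gaussian priors c0 = 1.0 +/- 0.5 and c1 = 0 +/- 1 appended as
# extra rows: each prior acts like one additional measurement.
A_prior = np.array([[1 / 0.5, 0.0], [0.0, 1 / 1.0]])
b_prior = np.array([1.0 / 0.5, 0.0])

A_aug = np.vstack([A, A_prior])
b_aug = np.concatenate([b, b_prior])
c, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
print(c)  # posterior central values for (c0, c1)
```

Minimizing this augmented chi-squared is equivalent to maximizing the posterior with Gaussian priors, which is the mechanism that lets the constrained fit absorb many weakly determined LECs.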

We have a rough idea of what the width of our parameters should be based on the size of our expansion parameters. Regardless, we can check whether our parameters are reasonable by using the empirical Bayes method, which uses the data to determine the most likely priors that would support that data.

Empirical Bayes Method

To see how empirical Bayes works, consider a model of $F_K/F_\pi$. Usually when we think of models, we think of the analytic expressions describing some phenomenon – in this case, a chiral expression for $F_K/F_\pi$ to some order. But we can extend the idea of a model to also include the set of priors for the LECs, which I’ll denote with a capital $\Pi$. The chiral expression is then $f$.

Our goal is to find the set of priors for the LECs that is most likely given our chiral expression and data. Using Bayes theorem, we see that a prior is more likely to describe our model when either:

  1. the likelihood function (the blue expression) increases or
  2. we have an a priori reason for preferring some prior over another (the green expression).

The green expression is also known as a hyperprior distribution since it is the prior distribution for the priors. The denominator is just a normalization factor.

In our case, we only use the empirical Bayes method to determine the appropriate widths for our priors. We expect our LECs to be order 1, but by probing multiple orders of magnitude, we can verify this hypothesis. Because we don’t typically know the sign of our LECs a priori, we usually set the central value of our priors to 0.

As an example, suppose we are confident in our priors for our chiral LECs, but we aren’t sure about the priors for the discretization terms. We consider the following candidates: all discretization LECs $0 \pm 0.1$, $0 \pm 1$, or $0 \pm 10$, that is we explore the appropriate prior width over three orders of magnitude.

Since we have no a priori reason for thinking any of these priors might be better than another, we can set the prior distribution for our priors (the green expression) to be uniform. Then the most likely prior, according to empirical Bayes, will be the prior that maximizes the likelihood function (the blue expression).

The likelihood function can be readily calculated – it just requires us to marginalize over our parameters. In general this is hard, but for strongly peaked distributions we can use Laplace’s method to approximate the integral numerically.
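
In a toy conjugate-Gaussian case the marginalization can even be done exactly, which makes the mechanics easy to see. Here a single parameter with prior $N(0, s_p^2)$ is constrained by one “measurement”; the numbers are illustrative:

```python
import numpy as np

# One parameter theta with prior N(0, s_p^2), one datum d = theta + noise
# with noise ~ N(0, s_n^2). Marginalizing theta exactly gives a Gaussian
# marginal likelihood with variance s_p^2 + s_n^2 (no Laplace
# approximation needed in this conjugate toy case).
d, s_n = 0.8, 0.3

def log_marginal(s_p):
    var = s_p**2 + s_n**2
    return -0.5 * (d**2 / var + np.log(2 * np.pi * var))

# Probe candidate prior widths over three orders of magnitude.
for s_p in (0.1, 1.0, 10.0):
    print(s_p, log_marginal(s_p))
```

The too-narrow prior is penalized for conflicting with the datum, and the too-wide prior is penalized by the Occam factor in the normalization, so the order-1 width wins, which is the behavior the empirical Bayes scan exploits.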

Finally, there’s one caveat I should mention. The scheme I’ve described assumes that the hyperprior distribution is uniform. But if you start fiddling with the candidate priors too much, that assumption is no longer a good one.

Model averaging

Again, we have 24 different candidate models to describe our data. We give each model a different weight in accordance to the model’s Bayes factor, which we then use to average each model’s extrapolation to the physical point. The Bayes factor is calculated by marginalizing over each of the model parameters and therefore allows us to compare models with different parameters. Additionally, it automatically penalizes overcomplicated models.
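
The averaging arithmetic can be sketched as follows; the averaged variance picks up a term from the spread between models, so disagreement among models inflates the final error. All numbers are illustrative:

```python
import numpy as np

# Each model i gives an extrapolated value mu_i +/- sig_i and a
# (relative) Bayes factor; weights are the normalized Bayes factors.
mu    = np.array([1.1960, 1.1980, 1.1945])
sig   = np.array([0.0030, 0.0025, 0.0040])
bayes = np.array([1.0, 2.5, 0.5])

w = bayes / bayes.sum()
mean = np.sum(w * mu)
# Law of total variance: within-model variance + between-model spread.
var = np.sum(w * (sig**2 + mu**2)) - mean**2
print(mean, np.sqrt(var))
```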

Model comparison

In this slide we see how our different model choices impact the model average.

In the top plot, we see that the data prefers $F_\pi$ for the cutoff. In the bottom right plot, we see that the data heavily prefers using a pure Taylor expansion at N2LO, suggesting we have insufficient data to resolve the full, analytic N2LO chiral expression. In the last plot on the bottom right, we see that Taylor-expanding the ratio at NLO is not preferred.

While not shown here, models with and without the $\alpha_S$ correction have about equal weight.

Error Budget

Next we can break down our sources of error. The plot on the right largely reiterates what I said before, but there are a few additional things we can glean from it. For example, we have a single ensemble generated at $a=0.06$ fm. If we hadn’t generated this ensemble, our uncertainty would’ve slightly increased and our extrapolation would’ve shifted down by roughly half a sigma.

Looking at our error budget, we find that the largest source of error came from statistics, and the second largest source came from discretization, giving us a clear path for improving our result: simply increase the number of configurations and ensembles.

Finally, since the up and down sea quarks are degenerate in our action, we also calculate an SU(2) isospin correction.

Previous Results

Comparing our result with other collaborations, we see that our result is in good agreement. The blue band is our result, and the green band is the FLAG average, which is essentially the lattice equivalent of the PDG.

Again, we emphasize that each of these groups is using a different lattice action. Our goal here isn’t to determine the most precise value of $F_K/F_\pi$ but to check that our action is behaving reasonably, in much the same way that experimentalists calculate the same quantity in different ways to check that their equipment is properly calibrated.

So while our result might not be the most precise, we have accomplished the goal we set out to do, which was to verify that our action yields reasonable results.

$|V_{us}|$ from $F_K/ F_\pi$

Finally, as I mentioned at the start of my presentation, we can use $F_K/F_\pi$ to determine $V_{us}$. Using Marciano’s relation, we get the red band. The blue band is the FLAG average for $V_{us}$, which was determined by a different method using semileptonic form factors and the Ademollo–Gatto [ah-di-mall-o gat-o] theorem.

The green band is the experimental result for $V_{ud}$ as determined by superallowed nuclear beta decays. The intersection of the green band and the red band, therefore, yields our determination of $V_{us}$. There’s a little bit of tension between our result and the FLAG average.

Finally, we calculate the unitarity condition for the CKM matrix mentioned before and find that our result supports it.

$F_K/F_\pi$ results

In conclusion, we can calculate $V_{us}$ from $F_K/F_\pi$, which allows us to test the unitarity condition of the CKM matrix. Further, $F_K/F_\pi$ is a gold-plated quantity, which we can use to compare lattice actions. We see that our action gives a result congruent with previous determinations of $F_K/F_\pi$.

Next we’d like to consider the hyperons, which provide a second way of extracting $V_{us}$.

Why study hyperons?

First, what are hyperons and why do we care about them?

  • By definition, a hyperon is a baryon containing at least one strange quark but no quarks of a heavier flavor
  • Historically, hyperons were first discovered around the 1960s. Gell-Mann observed that these baryons could be classified into a baryon octet and decuplet if the baryons were composed of partons individually obeying an SU(3) flavor $\times$ SU(2) spin symmetry. Using the picture on the left, Gell-Mann was able to predict the existence of the Omega hyperon, which was discovered a few years later.

So why do we still need to study hyperons now?

  • These days we have a more comprehensive model for understanding the particle zoo – namely, the Standard model.
  • One prediction of the Standard Model is that the CKM matrix, which describes flavor mixing by the weak interaction, should be unitary.
  • This leads to the so-called “top-row unitarity” condition. One way to study this is through hyperons, as I will elaborate on later.
  • Some posit that hyperons might be stable over millions of years in neutron stars. Understanding properties of hyperons is important for modeling the equation of state of neutron stars, which dictates how soft or squishy neutron stars are.
  • We know that $\chi$PT works well for mesons but not necessarily baryons. Testing the convergence of chiral expressions for the hyperon mass formulae and axial charges serves as an important test of heavy baryon $\chi$PT.

So why bother with hyperon decays?

  • First reason: LHCb will soon give improved estimates of hyperon decay widths, which should improve this method for estimating $V_{us}$; we estimate this source can become competitive if we can determine the lattice part to ~1%
  • Second: although you can estimate $V_{us}$ using either Kl2 or Kl3 decays, the two sources give different results

Why do we need the lattice?

  • Nucleon structure has been well-studied experimentally. Many-body bound states of nucleons are abundant (chemistry)
  • Hypernuclear structure much harder to study – requires experiments like the LHCb
  • Difficulty comes from their instability
  • Can’t use them for scattering

Notes:

  • The baryon decuplet/octet comes from SU(6) symmetry (a superset of SU(3) flavor $\otimes$ SU(2) spin, as can be seen from either picture). We have $6 \otimes 6 \otimes 6 = 56 \oplus \cdots$, with the $56$ irrep decomposing as $56 = 10 \otimes 4 \oplus 8 \otimes 2$. Only a single $10$ and a single $8$ survive once color symmetry is considered too.

Experimental determination of $V_{us}$

  • To check top-row unitarity, we need $V_{ud}$, $V_{us}$, and $V_{ub}$
  • But $V_{ub}$ is negligible, so this is mostly just a relationship between $V_{ud}$ and $V_{us}$
  • Unlike determinations of $V_{ud}$, which can be obtained purely from experiment + theory, the best estimates of $V_{us}$ require LQCD.
  • Three ways to estimate $V_{us}$: kaon, hyperon, and tau decays
  • Historically hyperons were used to determine $V_{us}$, but these days the most precise determinations come from kaons
  • In fact, of these three sources, the largest uncertainty currently comes from hyperon decays

Tension with unitarity (1/2)

  • We see this discrepancy in the figure from the most recent FLAG review

[Explain plot]

  • $V_{us}$ can be determined either from $F_K/F_\pi$ (diagonal band) or from the zero-momentum form factor (horizontal band)
  • $V_{ud}$ is determined experimentally from superallowed beta decays
  • The intersection of the vertical blue band with either the diagonal or the horizontal band yields $V_{us}$
  • The two results are clearly in disagreement
  • In fact, the problem is worse (or better, depending on your perspective) as a check of unitarity
  • If you assume the experimental value of $V_{ud}$ and calculate the top-row unitarity check, the $F_K/F_\pi$ determination gives a disagreement with unitarity at the 3.2-sigma level
  • Likewise, the Kl3 measurement gives a disagreement at the 5.6-sigma level
  • We’d like to use hyperon decays to see which of these estimates of $V_{us}$ is more likely to be correct

Notes:

  • Quoted numbers: FLAG average for $N_f = 2 + 1 + 1$; results are similar-ish for $N_f = 2 + 1$, though less drastic (2.3- and 4.3-sigma)
  • Quoted numbers assume superallowed beta decay estimate; using only the lattice gives a roughly 2-sigma deviation.
  • Dashed line: correlation between $V_{us}$ and $V_{ud}$ assuming SM unitarity.
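The arithmetic of the unitarity check is simple enough to sketch in a few lines. The snippet below propagates uncertainties through the top-row sum by hand; the input values are illustrative placeholders of roughly the right size, not the actual FLAG/PDG numbers quoted on the slide.

```python
import math

def top_row_deficit(vud, sig_ud, vus, sig_us, vub=0.00382, sig_ub=0.00024):
    """Delta = |Vud|^2 + |Vus|^2 + |Vub|^2 - 1 and its uncertainty,
    via linear error propagation (inputs treated as uncorrelated)."""
    delta = vud**2 + vus**2 + vub**2 - 1.0
    sigma = math.sqrt((2 * vud * sig_ud) ** 2
                      + (2 * vus * sig_us) ** 2
                      + (2 * vub * sig_ub) ** 2)
    return delta, sigma

# Illustrative inputs (placeholders): Vud from superallowed beta decays,
# Vus from a Kl3-type determination.
delta, sigma = top_row_deficit(0.97373, 0.00031, 0.2231, 0.0006)
print(f"Delta = {delta:.5f} +/- {sigma:.5f} ({abs(delta)/sigma:.1f} sigma)")
```

Note how $V_{ub}$ barely matters: its squared contribution is at the $10^{-5}$ level, which is why the check is effectively a relationship between $V_{ud}$ and $V_{us}$.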

Tension with unitarity (2/2)

See: https://arxiv.org/pdf/1910.07342.pdf

Project goals

  • Mass spectrum – good first check
  • Next axial charges, vector charges
  • Finally calculate form factors
  • Explain lattice

Previous Work

Not working in a vacuum

Mass spectrum:

  • Hyperon spectrum has been analyzed numerous times
  • Well-known figure from BMW
  • We have a different lattice setup

Axial charges:

  • Less work on hyperon form factors
  • First calculation of a hyperon axial charge from LQCD occurred in 2007, using only a single lattice spacing & pion mass (Lin)
  • Lin (2018): recent work
  • Lin: first extrapolation of hyperon axial charges to the continuum limit; used the ratio of the hyperon axial charge to the nucleon axial charge
  • Lin used a Taylor expansion in the pion mass, lattice spacing, and volume – we plan to perform a simultaneous chiral fit, as I’ll elaborate on later
  • Compared to Lin, we will benefit from having more ensembles available in our analysis, including more at the physical pion mass

Vector form factors:

  • Not shown here
  • Work by Sasaki (2017) and Shanahan et al (2015) on the hyperon transition vector form factors

Notes:

  • Lin used the ratio instead: (1) cancellations between numerator and denominator, (2) avoids uncertainties from renormalization of the axial-current operator

$\Xi$ Correlator fits

  • We’ve fit most of the hyperon correlators for the ensembles we have
  • Stability plot on right
  • Masses of the $\Xi$ vs $m_\pi^2$ on various ensembles, ranging from $a = 0.06$ to $a = 0.15$, with 3 ensembles at the physical pion mass
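The idea behind these correlator fits can be illustrated with a toy two-state model (a sketch, not our analysis code): the effective mass of a Euclidean correlator is contaminated by excited states at early times and plateaus at the ground-state energy at late times, which is what the stability plot probes.

```python
import math

def correlator(t, e0=0.65, e1=1.2, a0=1.0, a1=0.4):
    """Two-state Euclidean correlator: C(t) = sum_n a_n^2 exp(-E_n t)."""
    return a0**2 * math.exp(-e0 * t) + a1**2 * math.exp(-e1 * t)

def effective_mass(t):
    """m_eff(t) = log[C(t)/C(t+1)]; approaches E0 once the excited
    state (gap e1 - e0) has decayed away."""
    return math.log(correlator(t) / correlator(t + 1))

# Early times: contaminated by the excited state.
# Late times: plateau at the ground-state energy e0 = 0.65.
print([round(effective_mass(t), 4) for t in (1, 5, 10, 20)])
```

In real data the late-time plateau is also the noisiest region, which is why choosing the fit window (and checking stability against it) matters.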

Fit strategy: mass formulae

Consider the S=2 hyperons

  • Explain $\Lambda_\chi$ – dimensionless LECs
  • Examining the chiral mass formulae, we see that they share many common LECs, namely the sigma terms and the axial charges
  • This suggests that when we perform this analysis, we should fit the hyperon mass formulae simultaneously
  • But it also suggests that our fit will benefit from simultaneously fitting the hyperon masses with the axial charges, which should improve the precision of both the mass extrapolations and the axial charge extrapolations

Hyperon mass spectrum: $\Xi$ preliminary results

  • We have calculated the mass spectrum for the hyperons; as an example, the results for the $\Xi$ are shown
  • We perform 40 Bayesian fits and then take a model average over these models
  • [Explain models]

[Explain plot]:

  • Histogram of all models, sorted by which order of chiral terms is included
  • We see that the models that contribute the most to the model average are those without $\chi$PT terms included
  • But we see that the models with $\chi$PT terms also agree
  • In summary, we are not yet sure whether we can say $\chi$PT is converging for these observables
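The model average itself is straightforward; below is a minimal sketch (toy numbers, not our fit results) of weighting each model by its Bayes factor and propagating both the within-model uncertainty and the spread between model means.

```python
import math

def model_average(results):
    """results: list of (mean, sigma, log_evidence) triples.
    Returns the model-averaged mean and sigma; the variance includes
    the spread between model means, not just the individual errors."""
    max_log = max(le for _, _, le in results)          # for numerical stability
    weights = [math.exp(le - max_log) for _, _, le in results]
    norm = sum(weights)
    weights = [w / norm for w in weights]
    mean = sum(w * m for w, (m, _, _) in zip(weights, results))
    var = sum(w * (s**2 + m**2) for w, (m, s, _) in zip(weights, results)) - mean**2
    return mean, math.sqrt(var)

# Three toy models: (mean, sigma, log evidence)
toy = [(1.32, 0.02, 0.0), (1.35, 0.03, -1.0), (1.30, 0.02, -2.5)]
mean, sigma = model_average(toy)
print(f"{mean:.4f} +/- {sigma:.4f}")
```

The between-model term in the variance is what keeps the averaged error honest when models with and without $\chi$PT terms disagree.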

Summary

[See slide]

Notes:

  • Use Feynman–Hellmann fits

The nucleon sigma term

The sigma terms: what are they and what are they good for?

  • $\bar{q}q$ quark-mass shift: comes from the Feynman–Hellmann theorem; for the nucleon–pion sigma term, the sigma term is proportional to the LO correction to the chiral mass expression for $M_N$
  • Heavy-quark sigma terms can be calculated perturbatively, while light-quark terms can be extracted either through phenomenology or lattice QCD
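Schematically, the relation underlying these statements (a textbook sketch, not our exact conventions):

```latex
% Feynman–Hellmann: the sigma term measures the quark-mass dependence of M_N
\sigma_{N\pi} \equiv m_l \,\langle N | \bar{u}u + \bar{d}d | N \rangle
             = m_l \,\frac{\partial M_N}{\partial m_l}
% With M_N = M_0 + \alpha\, m_\pi^2 + \cdots and m_\pi^2 \propto m_l (GMOR),
% \sigma_{N\pi} \approx \alpha\, m_\pi^2 at LO: proportional to the LO
% quark-mass correction to the chiral expression for M_N.
```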

Previous work

  • On the left is the FLAG average for the nucleon sigma term
    • We see that most of the results tend to converge around 40 MeV, particularly if we concentrate on the green values (which FLAG has determined have better control over systematics)
    • However, most of the phenomenological results (bottom blue) converge around 60 MeV
    • Therefore there appears to be tension between the results from phenomenology and the lattice
  • However, since FLAG 2019 was compiled, there have been two major additions:
    • First, BMW announced their result, which found a sigma term largely in agreement with other lattice results
    • Second, Gupta et al announced their result, which found a sigma term in agreement with phenomenology.
    • So we’d like to help resolve this tension between lattice and phenomenology – or perhaps instead between lattice results

Notes:

  • Gupta attributes their result to better accounting of the contamination from excited states

Expansion of $\sigma_{N\pi}$

  • Now that we have an extrapolation for $M_N/\Lambda_\chi$:
    • We have worked out a novel expression for the nucleon sigma term which allows us to rewrite the expression in terms of dimensionless quantities
    • First we note that the coefficient (the terms in the first set of brackets) comes from re-expressing the quark-mass derivative in terms of the dimensionless quantity $\epsilon_\pi$.
    • This coefficient has a small correction at $\mathcal{O}(\epsilon_\pi^2)$, but for now it’s sufficient to think of this term as being only slightly smaller than 1, maybe about 0.9
    • More importantly, we see that in this formulation, we can separate the derivative into two expressions: one that only depends on a chiral fit of $M_N/\Lambda_\chi$ and a second term that only depends on a fit of $F_\pi$
  • Here I have used asterisk to emphasize that these quantities are evaluated at the physical point; therefore we don’t need to substitute in the chiral expression for $M_N$, for example – it is sufficient to use the PDG value
  • In the first derivative term, we observe that $F_\pi$ (through the chiral scale) introduces LECs; however, as mentioned previously, these LECs can be grouped with LECs from $M_N$ such that they don’t actually need to be determined
  • However, the situation with the latter derivative term is different – here we actually need a fit to $F_\pi$
  • That said, if we limit our analysis to $\epsilon_\pi^2$, we can get away with using just the FLAG value for $l_4$
  • The most interesting observation here, however, is that the bulk of the contribution to the sigma term actually comes from the $F_\pi$ derivative term. Therefore fitting $F_\pi$ is integral for determining the nucleon sigma term, assuming we’re interested in expanding beyond $\mathcal{O} (\epsilon_\pi^2)$ or don’t want to rely on the FLAG average for $l_4$
  • Everything in red cancels
  • Higher-order LECs must also be determined
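Written out schematically (my notation; $\epsilon_\pi = m_\pi/\Lambda_\chi$, $\Lambda_\chi = 4\pi F_\pi$, asterisks denote evaluation at the physical point, and I drop the small $\mathcal{O}(\epsilon_\pi^2)$ correction to the overall coefficient):

```latex
\sigma_{N\pi} = m_l \frac{\partial M_N}{\partial m_l}
\approx \frac{\epsilon_\pi}{2}
  \frac{\partial}{\partial \epsilon_\pi}
  \left[ \Lambda_\chi \cdot \frac{M_N}{\Lambda_\chi} \right]_{*}
= \frac{\epsilon_\pi \Lambda_\chi}{2}
  \left[ \frac{\partial (M_N/\Lambda_\chi)}{\partial \epsilon_\pi}
       + \frac{M_N}{\Lambda_\chi} \, \frac{4\pi}{\Lambda_\chi}
         \frac{\partial F_\pi}{\partial \epsilon_\pi} \right]_{*}
```

The first derivative term requires only the chiral fit of $M_N/\Lambda_\chi$; the second requires a fit of $F_\pi$.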

Extra slides

Ideal properties of a lattice action

As I previously stated, there are infinitely many ways of discretizing QCD, all of which should converge in the zero-lattice-spacing, infinite-volume limit. Generically we can write the fermionic part of the lattice action in the following way [motions to equation]. The Dirac matrix, in this picture, is responsible for linking the lattice sites.

Here are some desired qualities of a discretized action. The first two, locality and translational invariance, are generic requirements for any QFT. The latter two are more particular to QCD.

Properties:

  • locality: the range of the action should fall off exponentially with distance. This can be rewritten as a condition on the Dirac matrix (its elements should fall off like $|D(x,y)| \propto e^{-|x-y|/(\xi a)}$); usually we enforce this condition by only allowing adjacent lattice sites to couple, a stricter condition known as ultralocality
  • translational invariance: desired of all QFTs. On the lattice, this becomes invariance under discrete translations; likewise, rotational invariance is reduced to invariance under the cubic group
  • chiral symmetry: respect a global symmetry in which the left-handed and right-handed components of the Dirac/quark fields can independently transform under an SU(N) symmetry. This symmetry is slightly broken. I’ll give more details on this later. However, the important takeaway for now is that this condition also adds another restriction to the Dirac matrix.
  • no fermion doublers: doublers are unphysical poles in the quark propagator which sometimes occur when we discretize the Dirac matrix.

Unfortunately, the Nielsen-Ninomiya theorem says it’s impossible to simultaneously satisfy all these conditions in an even dimensional theory. But as we’ll see, it’s possible to circumvent this restriction by adding a fifth dimension to the theory.

Notes:

  • Chiral symmetry:
    • Using the gamma matrices ($\gamma_5$, specifically), we can project out the left-handed and right-handed parts of the quark fields. Then we can write a quark field as the sum of its left-handed and right-handed components. In QCD, in the limit of 0 quark mass, we can transform each of these quark fields by individual U(N) rotations while leaving the Lagrangian invariant. ($N=2$ for SU(2) isospin symmetry, $N=3$ for SU(3) flavor symmetry.)
    • Limit on Dirac matrix: $\psi \rightarrow e^{i \alpha \gamma_5} \psi$. In the chiral ($m_q = 0$) limit, respecting chiral symmetry is equivalent to saying that the Dirac matrix and $\gamma_5$ anticommute: $\{D, \gamma_5\} = 0$ when $m_q = 0$. Thus “respecting chiral symmetry” can be thought of as a limitation on the Dirac matrix.
  • Doublers:
    • Technical explanation: Fourier transform into momentum space to determine the quark propagator from the Dirac matrix. For the naive (easiest/most obvious) discretization of the lattice action, there are 15 additional poles in the quark propagator besides the one at $p=0$. These unphysical poles are known as doublers.
    • Alternate explanation: doublers arise from periodicity in momentum space. [Thank contributors, DoE]
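The periodicity argument can be made concrete with a few lines of code (a schematic one-dimensional check, my own illustration): the momentum-space kernel of the naively discretized derivative behaves like $\sin(pa)/a$, which vanishes not only at $p = 0$ but also at the edge of the Brillouin zone.

```python
import math

def naive_dispersion(p, a=1.0):
    """Momentum-space kernel of the naively discretized derivative:
    sin(p a)/a, which replaces the continuum momentum p."""
    return math.sin(p * a) / a

# The quark propagator has poles where this kernel vanishes.
# Physical pole at p = 0:
print(naive_dispersion(0.0))
# Unphysical doubler at the Brillouin-zone edge p = pi/a:
print(naive_dispersion(math.pi))
# In 4D each direction doubles independently: 2^4 = 16 poles,
# i.e. 15 doublers besides the physical pole.
```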

Really, the nucleon mass?

You might wonder why a lattice practitioner would be interested in the nucleon mass. After all, the masses of the proton and neutron are some of the most precisely determined quantities in physics. In fact, when written in MeV, the conversion factor from atomic mass units to MeV actually introduces the dominant source of error.

  • more generally, we can use the lattice to access observables that are difficult or impossible to access experimentally


Phenomenological significance of the nucleon-pion sigma term

  • spin-independent contribution adds coherently with each nucleon in the atom (eg, xenon, in the case of LUX)
    • in contrast, spin-dependent contributions average to 0 and do not add coherently
  • suppressed by a factor of $\beta^2$
    • here $\beta^2$ is fairly small since the speed of incident particles would be roughly the speed of the sun as it orbits the Milky Way
  • The discrepancy between phenomenology and LQCD would introduce the largest uncertainty into this sigma term
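For scale, a back-of-the-envelope estimate of the suppression factor (using an assumed round number of roughly 230 km/s for the Sun's orbital speed about the Milky Way):

```python
# Suppression factor beta^2 for galactic-halo WIMPs scattering off nuclei.
# v ~ 230 km/s is an assumed round number for the solar orbital speed.
C = 299_792_458.0   # speed of light, m/s
v = 230e3           # assumed incident-particle speed, m/s
beta_sq = (v / C) ** 2
print(f"beta^2 ~ {beta_sq:.1e}")  # well below 1e-6, hence "fairly small"
```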

Notes:

  • Lagrangian: https://arxiv.org/abs/1805.09795
  • the supersymmetric partners of the Higgs, gauge bosons, etc mix into four mass eigenstates (the neutralinos)

Two Paths to the sigma term

  • 3-point functions also have disconnected diagrams
  • baryon 2-point function is noisy but not as difficult to fit as a 3-point function


Project objectives

  • [briefly summarize]
  • [thank MILC for their gauge configurations]
  • [show new data]
  • Shoutout to the MILC cow

N correlator fits

[talk about effective mass plot, stability plot]

Fit strategy: mass formulae

  • We use $\Lambda_\chi = 4 \pi F_\pi$ as an approximation to the chiral symmetry breaking scale
  • Expansion in $\epsilon_\pi$
  • We don’t fit $M_N$ directly
    • even though we’ve finished our scale setting, scale setting introduces strong correlations between lattice ensembles which would otherwise be uncorrelated
    • we therefore extrapolate the quantity $M_N/\Lambda_\chi$, which is dimensionless
  • Terms in green come from the expansion of $1/\Lambda_\chi$
    • Notice that the LECs introduced by the expansion can be grouped with other LECs when fitting
    • There are therefore no additional LECs that need to be fit, but there are a few extra logs
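Schematically, the source of the green terms (my sketch; the sign convention and the scale inside the logarithm depend on the definition of $\bar{\ell}_4$):

```latex
% NLO expansion of the chiral scale in epsilon_pi = m_pi / Lambda_chi
\frac{1}{\Lambda_\chi} = \frac{1}{4\pi F_\pi}
  = \frac{1}{4\pi F}
    \left[ 1 - \epsilon_\pi^2 \left( \bar{\ell}_4 - \ln \epsilon_\pi^2 \right)
           + \mathcal{O}(\epsilon_\pi^4) \right]
```

The new LEC $\bar{\ell}_4$ multiplies the same structures already present in the expansion of $M_N/\Lambda_\chi$, so it can be grouped with existing fit parameters; the chiral logarithms are the "few extra logs."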

Notes:

  • The delta contribution is negative ($\sim m_\pi^3$); if we include it, then the $\sim m_\pi^2$ term becomes more positive, and the $m_\pi^4$ term must necessarily be positive; further, since the sigma term is proportional to $\sim m_\pi^2$ at LO, the sigma term will also become more positive
  • The delta–nucleon mass splitting is 290 MeV $\sim 2 m_\pi$
    • therefore it is another mass scale in the theory
    • as it so happens, $g_{\pi N \Delta}$ is a large coupling
    • at large $N_c$, the nucleon and delta become degenerate – delta is necessary in eft; also delta is stable in this limit
    • for some of our ensembles/pion masses, delta is stable
    • as pion mass is increased, mass splitting becomes smaller

$M_N/\Lambda_\chi$ extrapolations

  • Here we have an example extrapolation of $M_N/\Lambda_\chi$
  • Looking at the plot on the left, if we try to trace the extrapolation for each lattice spacing, we see that these extrapolations cross over regularly
    • Thus either the continuum extrapolation oscillates in a very complicated manner, or
    • As suggested by Occam’s razor, the continuum extrapolation is flat
  • The plot on the right shows that a flat extrapolation for the discretization terms reasonably describes the data

  • We could also include the PDG nucleon mass as a point, which could be useful for determining the sigma term; however in practice, we find that including this point has little benefit

Note:

  • specify model that’s being fit


$l_4$ dependence

  • FLAG reports determinations of $l_4$ ranging from 3.5 to 5
  • Depending on the choice of $l_4$, we could get a value in agreement with either BMW or Gupta

$F_\pi$ extrapolation

  • interested in $l_4$
  • [explain red band ($N_f = 2 + 1$)]
  • emphasize NLO
  • continuum extrapolation supports a large value of $\sigma_{N\pi}$
