Can we Contain Covid-19 without Locking-down the Economy?

Shai Shalev-Shwartz, Amnon Shashua

Coronavirus illustration created at the Centers for Disease Control and Prevention (CDC)

The authors wish to thank the faculty members of the computer science department of the Hebrew University, as well as Prof. Peter Bartlett, Prof. Nir Friedman, Prof. Katrina Ligett, Prof. Nati Srebro, Prof. Herve Bercovier, and Dr. Renana Eitan, for comments and feedback on earlier drafts of this paper.

In this article, we present an analysis of a risk-based selective quarantine model where the population is divided into low and high-risk groups. The high-risk group is quarantined until the low-risk group achieves herd-immunity. We tackle the question of whether this model is safe, in the sense that the health system can contain the number of low-risk people that require severe ICU care (such as life support systems).

One could consider three models for handling the spread of Covid-19.

  1. Risk-based selective Quarantine: Divide the population into two groups, low-risk and high-risk. Quarantine the high-risk and gradually release the low-risk population to achieve a managed herd immunity of that population. The managed phase is designed to allow the health system to cope with the expected number of severe cases. Given the herd immunity of the low-risk group, we can gradually release the high-risk population. The question is how to manage the release from quarantine of the low and high-risk populations in a way that will not overwhelm the health system.
  2. Containment-based selective quarantine: Find all the positive cases and put them in quarantine. This requires estimating the “contagious time interval” [t0,t1] per age group. Given this interval, one could use “contact tracing” to recursively isolate all the individuals at risk from a person who is carrying the virus. Another tool is predictive testing, which uses contact tracing to identify people who have many contacts with others and tests them.
  3. Countrywide (or region-wide) lock-down until the spread of the virus is under control. The lock-down could take anywhere from weeks to months. This is the safest route but does not prevent a “second wave” from occurring.

Models 2 and 3 could work in tandem and have been tried in China and Singapore. Model 3 is currently the default model around the globe and naturally has a tremendous crippling impact on the economy. In the remainder of this article, we derive some tools for analyzing the viability of the risk-based model. Specifically, what sample size and confidence level are needed to make sure that the health system can cope with the severe cases this model produces?

Covid-19 spread over time (in terms of ICU cases) without applying any safety measures.
Covid-19 spread over time (in terms of ICU cases) when applying countrywide lockdown (model 3).
Covid-19 spread over time (in terms of ICU cases) when applying the proposed Risk-based model.

Consider a plausible definition of a high-risk group based on a cut-off age and certain pre-existing conditions. For the sake of concreteness, assume the cut-off age is 67+ which represents the retired segment of society. The low-risk group is the remainder of society which is released to their daily routine while following certain distancing protocols that are aimed at slowing the spread, while keeping the economy un-disrupted to a large degree, but ultimately reaching a herd immunity level. At that point it is safe to gradually release the high-risk group from quarantine. The question is how do we guarantee that the health system will not be overwhelmed during the spread of the virus in the low-risk group?

Let b be the number of severe cases, e.g., those that require an intensive care unit (ICU) or specifically respiratory systems, that the health system can handle, say b = 600 for a country of the size of Israel¹. Let m_d be the number of low-risk people that will develop severe symptoms and will require an ICU assuming we will adopt the risk-based selective model. Then, the model is “safe” if b > m_d. Our goal is to derive an upper bound on m_d, so that we’ll be able to ensure that b > m_d. Let m be the size of the low-risk group and let ν be the probability that a person that comes from the low-risk group will develop severe symptoms, assuming the person is currently sick. Then,

    m_d ≤ ν·m

So, to ensure that b > m_d we will require that ν < b/m.
Before we continue, we note that fully understanding the dynamics of the spread of the virus and the dynamics of the development of the disease (for the sake of knowing when we will need ICUs and for how long) is very challenging and will probably require much more research and time. What we propose here is a worst-case analysis. The idea is to adopt a pessimistic view and show that even under this pessimistic view, the health system is not likely to collapse. We have already made a pessimistic assumption: we did not take into account that not all of the low-risk population will get sick, and that even those who do get sick will not get sick at the same time and will not need an ICU at the same time.

Continuing our derivation, let p* be the current, unknown, percentage of positive cases among the low-risk population and let k be the number of severe cases among the low-risk population from today until one week from now. Assuming that people that are positive cases will either develop severe symptoms within a week or will never develop severe symptoms, then we can estimate ν as follows:

    ν ≈ k / (p*·m)

Note that again we are taking a worst-case view. Some of the severe cases in a week’s time may be due to people who are not infected today but will get infected tomorrow. This is even likely when the pandemic grows at an exponential rate. Next, we turn the approximation into inequalities by deriving an upper bound on ν using a measure concentration inequality. While it is possible to rely on generic measure concentration bounds (e.g. Chernoff’s bound or Bernstein’s inequality [2, 3]), we rely on tighter bounds specific to the Binomial family due to [5].

Lemma 1 Fix some δ ∈ (0,1) and for every integer k let k̃(k, δ) be the minimal integer such that k̃ > k + 1 and

    Φ(−√(2(k̃ − (k + 1) + (k + 1) log((k + 1)/k̃)))) ≤ δ

where Φ is the cumulative distribution function of a standard normal distribution. Assuming that p*m > k̃, we have that, with a probability of at least 1 − δ,

    ν ≤ k̃ / (p*·m)

The proof of the lemma follows directly from Lemma 3 in the appendix. As an example, if we take δ = 0.05 and k = 15 then k̃ = 24. To show the tightness of this bound, observe that for a Binomial variable Sn ∼ Binomial(n, p), where in our case n = p*m and p = k̃/n, we have that (Sn − pn)/√(pn(1 − p)) behaves like a standard normal variable when n→∞. Therefore, if we want a confidence level of δ = 0.05, then at the limit, we need that (k − k̃)/√k̃ ≈ −1.65 (because Φ(−1.65) ≈ 0.05). This will happen when:

    k̃ ≈ k + 1.65²/2 + 1.65·√(k + 1.65²/4)

The graph below shows, as a function of k, the minimal value of k̃ that guarantees δ ≤ 0.05 according to Lemma 1, compared to the right-hand side of the above equation. As can be seen, the two curves are very close to each other.
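The comparison above can be reproduced with a minimal Python sketch. It assumes that the condition of Lemma 1 has the n-free form Φ(−√(2(k̃ − (k + 1) + (k + 1) log((k + 1)/k̃)))) ≤ δ, which is the limiting form implied by the proof of Lemma 3 in the appendix. The function names are illustrative and do not come from the paper.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def k_tilde(k: int, delta: float) -> int:
    """Smallest integer kt > k + 1 that meets the assumed Lemma 1 condition."""
    kt = k + 2  # first integer strictly larger than k + 1
    while True:
        # Assumed exponent: kt - (k + 1) + (k + 1) * log((k + 1) / kt), positive for kt > k + 1.
        exponent = kt - (k + 1) + (k + 1) * math.log((k + 1) / kt)
        if normal_cdf(-math.sqrt(2.0 * exponent)) <= delta:
            return kt
        kt += 1


def k_tilde_normal_approx(k: int, z: float = 1.65) -> float:
    """Solution of (k - kt) / sqrt(kt) = -z, i.e. the large-n normal approximation."""
    return k + z * z / 2.0 + z * math.sqrt(k + z * z / 4.0)


if __name__ == "__main__":
    for k in (5, 15, 50):
        print(k, k_tilde(k, 0.05), round(k_tilde_normal_approx(k), 1))
```

Under this assumed condition, `k_tilde(15, 0.05)` returns 24, in agreement with the example given in the text, while the normal approximation gives a value of roughly 22.9.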

Getting back to our derivation, based on Lemma 1, we can upper bound m_d (with a probability of at least 1 — δ) by

    m_d ≤ ν·m ≤ k̃/p*

since even if all of the low-risk population gets the virus, the number of severe cases will be at most the right-hand side of the above equation. Of course, in reality we expect m_d to be much smaller than the above, both because not all of the low-risk population will get the virus and because not all of the severe cases will be sick at the same time. Nevertheless, as mentioned previously, we adopt a worst-case analysis.

Next, we turn to derive a lower bound on p*. To do this, we will sample n persons, uniformly at random from the low-risk population, and will derive a lower bound on p* based on the number of people that came out positive. In particular,

Lemma 2 Fix some δ ∈ (0,1) and integer n, and for every integer r let r̃ be the maximal integer such that r − 1 > r̃ > 1 and

    1 − Φ(√(2n·D(r̃/n, (r − 1)/n))) ≤ δ

where Φ is the cumulative distribution function of a standard normal distribution and D(p, c) = c log(c/p) + (1 − c) log((1 − c)/(1 − p)). Let n be a sample size, let p* ∈ (0, 1), and let r be the number of positive cases we found in the sample. Then, with a probability of at least 1 − δ we have that p* > r̃/n.

The proof of the lemma follows directly from Lemma 4 in the appendix. All in all, we obtain that (with probability of at least 1 − 2δ),

    m_d ≤ k̃/p* ≤ k̃·n/r̃

The health system will be able to treat all of these severely sick people if b > m_d, so a sufficient requirement is that

    b > k̃·n/r̃

To get some intuition on the effect of the sample size n, suppose that we set δ=0.05, and suppose that p* = 0.02.
The graph below shows our lower bound as a function of n.

We see that when n = 1000 the lower bound falls below p* by a factor of 2.3, and when n = 5000 by a factor of 1.25.
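The lower bound on p* can be explored with a short Python sketch. It assumes that the condition of Lemma 2 reads 1 − Φ(√(2n·D(r̃/n, (r − 1)/n))) ≤ δ, which is the form that Theorem 1 and the proof of Lemma 4 in the appendix lead to, and it allows r̃ ≥ 1 as in Lemma 4. The function names are hypothetical, and the sketch is not claimed to be the computation behind the graph.

```python
import math
from typing import Optional


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kl(p: float, c: float) -> float:
    """D(p, c) = c log(c/p) + (1 - c) log((1 - c)/(1 - p)), as defined in the text."""
    return c * math.log(c / p) + (1.0 - c) * math.log((1.0 - c) / (1.0 - p))


def r_tilde(r: int, n: int, delta: float) -> Optional[int]:
    """Largest integer rt with r - 1 > rt >= 1 meeting the assumed condition.

    Returns None when no such integer exists (for example, when r is too small).
    Requires r < n so that (r - 1) / n stays strictly inside (0, 1).
    """
    c = (r - 1) / n
    # Scan downwards: the tail bound shrinks as rt decreases, so the first hit is maximal.
    for rt in range(r - 2, 0, -1):
        tail = 1.0 - normal_cdf(math.sqrt(2.0 * n * kl(rt / n, c)))
        if tail <= delta:
            return rt
    return None


if __name__ == "__main__":
    n, r, delta = 5000, 100, 0.05  # r = 100 positives corresponds to an observed rate of 0.02
    rt = r_tilde(r, n, delta)
    print("lower bound on p*:", None if rt is None else rt / n)
```

The returned r̃ divided by n is the lower bound on p*; the ratio between the observed rate r/n and r̃/n indicates how much is lost for a given sample size.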

Should decision-makers contemplate a risk-based quarantine approach, the purpose of this document is to provide them with formal and tight bounds for investigating whether the health system can cope with the number of severe cases that would reach the ICU. Embedded in the reasoning is the idea of selective quarantine (based on age groups and pre-existing conditions, but it could be any other criteria) where the “high-risk” group (the one we suspect will have a high rate of severe cases) is quarantined and the other is allowed to spread the virus under certain distancing protocols. The underlying premise is that a full population-wide quarantine is not a solution in itself; it is merely a step to buy time, followed by a more managed (non-brute-force) approach. The managed phase underlying our thinking is to create herd immunity of the low-risk group in a controlled manner while keeping the economy going. It is all about keeping the health system in check and not overwhelming its capacity to handle severe cases. The question we ask in this document is whether we can estimate in advance, through sampling, that the number of severe cases arising from the low-risk group would not overwhelm the system.

Taking Israel as a case study, as of March 30, k = 15 (out of 74 severe cases in ICU). As of today, there is no scientific estimate of p*, only indications. The graph below depicts our bound as a function of p*, ranging from 0.02 to 0.1, when we fix δ = 0.05, k = 15, and n = 5000. We also depict the values of k̃/p*, which would be the bound (on b, the number of critical beds) had we known p*, and the value of k/p*, which is the approximation of m_d based on expectations.

It is worth noting that the bound is fairly tight (for n = 5000) when comparing our bound on m_d with k̃/p* (i.e., plugging in the correct p* rather than the lower bound r̃/n). We also see that the bound on m_d ranges from around 1,500 down to 400, which translates to 15 to 4 critical ICU beds per 100,000 inhabitants. Given that Israel has, in normal periods, 6 beds per 100,000 inhabitants, the additional beds are something that Israel can reasonably handle. There are talks of going up to 50 critical beds during this crisis; therefore, the analysis indicates that releasing the low-risk group to achieve herd immunity is not unreasonable.

Some additional points are worth mentioning. The random variable k could be a very useful indicator for deciding what constitutes the high-risk group: what age cut-off and which pre-existing conditions to include. At any given time, we would want the definition of the high-risk group to yield a small value of k (as the required capacity b increases monotonically with k). For example, in our case study, we decided on a cut-off age of 67 and looked among the 74 severe cases for those without any pre-existing conditions. In order to avoid over-fitting, this kind of study should preferably be done with data coming from other countries.

Another point worth mentioning is that the risk-based quarantine model is not only beneficial from the point of view of economic sustainability. Among other selective quarantine ideas (such as those based on geography, or on contact tracing and isolating the infected and those around them), the risk-based approach has better chances of reducing the overall mortality rate. The reason is that the highest mortality is in the high-risk group, which in this model is isolated.
When the high-risk group is released from isolation, they would be facing a largely immune population and thus a very slow spread of infection, with a good chance to weather the storm until a cure or vaccine is available. In all other selective quarantine models, the high-risk and low-risk groups are equally susceptible to infection, so that even if the health system is not overwhelmed, the mortality of the high-risk group is still likely to be higher than in the risk-based model.

Yet another point worth mentioning is that we focused on what is “safe” for the health system in the sense of how to estimate the number of severe cases that would be low enough not to overwhelm the ICUs. We ignored the fact that some severe cases could end up in the mortality statistics even when given proper care. In fact, there are two probabilities to estimate: (i) the probability of being in the “severe” category among the low-risk group, and (ii) the probability of mortality given proper care. We have bounded the former and ignored the latter. The reason for doing so is that the latter is beyond the scope of this paper because it is essentially a moral tradeoff between “safety” and “usefulness” that is employed in every aspect of society. For example, society does not put a lockdown on passenger car use in order to significantly reduce car accidents even though such a lockdown would save lives. Likewise, governments do not allocate infinite budgets for the health system even though there is a correlation between increased investments and saving lives.

As a final remark, going out of quarantine is a choice, not an obligation. This is no different than people that are afraid of flights and decide not to go on an airplane. Families can decide to stay quarantined either as an extra safety measure or if some members of the family are from the high-risk group while the others are from the low-risk group.

Appendix A- Proofs

The proofs rely on the analysis given in [4], and in particular, on the following theorem due to [5].

Theorem 1 Let Sn ∼ Binomial(n, p), where p ∈ (0, 1) and n > 0, then for k ∈ {1, …, n − 2} we have

    Φ(sign(k/n − p)·√(2n·D(p, k/n))) ≤ P(Sn ≤ k) ≤ Φ(sign((k + 1)/n − p)·√(2n·D(p, (k + 1)/n)))

where Φ is the cumulative distribution function of a standard normal variable and D(p, c) = c log(c/p) + (1 − c) log((1 − c)/(1 − p)) is the KL divergence.
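As a sanity check, the two-sided bound can be compared numerically against the exact Binomial distribution function. The sketch below assumes the standard form of the inequality from [5], namely Φ(sign(k/n − p)·√(2n·D(p, k/n))) ≤ P(Sn ≤ k) ≤ Φ(sign((k + 1)/n − p)·√(2n·D(p, (k + 1)/n))). The helper names are illustrative.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kl(p: float, c: float) -> float:
    """D(p, c) = c log(c/p) + (1 - c) log((1 - c)/(1 - p))."""
    return c * math.log(c / p) + (1.0 - c) * math.log((1.0 - c) / (1.0 - p))


def signed_bound(n: int, p: float, j: int) -> float:
    """Phi(sign(j/n - p) * sqrt(2 n D(p, j/n))) for 1 <= j <= n - 1."""
    c = j / n
    d = max(0.0, kl(p, c))  # guard against a tiny negative value from rounding
    sign = (c > p) - (c < p)
    return normal_cdf(sign * math.sqrt(2.0 * n * d))


def binomial_cdf(n: int, p: float, k: int) -> float:
    """Exact P(Sn <= k) for Sn ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def bounds_hold(n: int, p: float, tol: float = 1e-12) -> bool:
    """Check the two-sided bound for every k in {1, ..., n - 2}."""
    for k in range(1, n - 1):
        exact = binomial_cdf(n, p, k)
        if signed_bound(n, p, k) > exact + tol or exact > signed_bound(n, p, k + 1) + tol:
            return False
    return True


if __name__ == "__main__":
    print(bounds_hold(50, 0.3))
```

The tolerance only absorbs floating-point rounding; it is not part of the theorem.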

Lemma 3 Let k, k̃ be two integers such that k̃ > k + 1 and let δ ∈ (0, 1) be such that

    Φ(−√(2(k̃ − (k + 1) + (k + 1) log((k + 1)/k̃)))) ≤ δ

where Φ is the cumulative distribution function of a standard normal distribution. Let n > k̃ be an integer and let p ∈ (0, 1). Suppose that Sn ∼ Binomial(n, p). Let a null hypothesis be that p ≥ k̃/n. Then, if Sn ≤ k, we can reject the null hypothesis with a probability of at least 1 − δ.

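Proof If p < k̃/n there is nothing to prove, so let us assume that p ≥ k̃/n, and observe that the extreme case is when p = k̃/n. Note that in this case we also have that np > k + 1, and therefore by Theorem 1,

    P(Sn ≤ k) ≤ Φ(−√(2n·D(k̃/n, (k + 1)/n)))

Denote f(k) = k̃ − (k + 1) and observe:

    n·D(k̃/n, (k + 1)/n) = (k + 1) log((k + 1)/k̃) + (n − k − 1) log((n − k − 1)/(n − k̃))
                         = (k + 1) log((k + 1)/k̃) + h(k, n),
    where h(k, n) = (n − k̃ + f(k)) log(1 + f(k)/(n − k̃))

Let’s analyze the function h(k, n). For a fixed k, we show that h(k, n) monotonically decreases when n increases:

    ∂h(k, n)/∂n = log(1 + f(k)/(n − k̃)) − f(k)/(n − k̃) ≤ 0

where we used log(1 + x) ≤ x for x > −1.

Hence h(k, n) is at least its limit as n→∞, which is f(k), and therefore

    P(Sn ≤ k) ≤ Φ(−√(2(f(k) + (k + 1) log((k + 1)/k̃)))) ≤ δ

which concludes our proof.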

Lemma 4 Let r, r̃, n be three integers such that n > r, r̃ ≥ 1 and r > r̃ + 1, and let δ ∈ (0, 1) be such that

    1 − Φ(√(2n·D(r̃/n, (r − 1)/n))) ≤ δ

where Φ is the cumulative distribution function of a standard normal distribution and D(p, c) = c log(c/p) + (1 − c) log((1 − c)/(1 − p)). Suppose that Sn ∼ Binomial(n, p). Let a null hypothesis be that p ≤ r̃/n. Then, we can reject the null hypothesis with probability of at least 1 − δ if Sn ≥ r.

Proof If p > r̃/n there is nothing to prove, so let us assume that p ≤ r̃/n, and observe that the extreme case is when p = r̃/n. By Theorem 1 we have

    P(Sn ≥ r) = 1 − P(Sn ≤ r − 1) ≤ 1 − Φ(√(2n·D(r̃/n, (r − 1)/n))) ≤ δ

which concludes our proof.

Appendix B- Estimating p∗ from pooled tests

A pooled test is obtained by taking a sample from each of T persons, mixing all of these samples, and searching for traces of the virus in the mixture. If the pooled test is positive, it means that at least one of the persons is positive. The reduction below shows how to utilize pooled tests.

Let ϕ* be the probability that for a random pool of size T we will have ST ≥ 1, where ST is the number of positive cases in the pool. Note that by definition of the pool test, it ends up positive if and only if ST ≥ 1. Observe that

    ϕ* = 1 − (1 − p*)^T

Hence, if ϕ* ≥ ϕ̃ we have that

    (1 − p*)^T ≤ 1 − ϕ̃

Rearranging terms, we obtain that p* ≥ p̃, where p̃ = 1 − (1 − ϕ̃)^(1/T). In other words, to show that p* ≥ p̃ it suffices to show that ϕ* ≥ ϕ̃.
The gain from the pool is clearly observed if p̃T ≪ 1. In this case, using the approximation 1 − x ≈ e⁻ˣ we have that ϕ̃ ≈ p̃T. Hence, the number of pooled tests we need is roughly T times smaller than the number of regular tests we need in order to reach the same conclusions about p*.
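The reduction can be illustrated with a small Python sketch. It assumes that the T persons in a pool are positive independently, each with probability p*, so that a pool is positive with probability 1 − (1 − p*)^T. The function names are hypothetical.

```python
def pool_positive_probability(p: float, pool_size: int) -> float:
    """Probability that a pool of `pool_size` independent samples contains a positive."""
    return 1.0 - (1.0 - p) ** pool_size


def prevalence_from_pool_rate(phi: float, pool_size: int) -> float:
    """Invert the relation above: the p for which a pool is positive with probability phi."""
    return 1.0 - (1.0 - phi) ** (1.0 / pool_size)


if __name__ == "__main__":
    pool_size = 10
    for p in (0.001, 0.01, 0.05):
        phi = pool_positive_probability(p, pool_size)
        # For p * pool_size << 1, phi is close to p * pool_size.
        print(p, round(phi, 5), round(p * pool_size, 5))
```

A lower bound ϕ̃ on the rate of positive pools therefore translates into the lower bound `prevalence_from_pool_rate(ϕ̃, T)` on p*. The printed columns show how the approximation ϕ̃ ≈ p̃T degrades as p̃T grows.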

[1] Israel has around 6 critical beds per 100,000 inhabitants.

[2] S. N. Bernstein. On a modification of Chebyshev’s inequality and of the error formula of Laplace. Annals Science Institute Sav. Ukraine, Sect. Math. 1, 1924.

[3] S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.

[4] M. Short. Improved inequalities for the Poisson and Binomial distribution and upper tail quantile functions. ISRN Probability and Statistics, 2013.

[5] A. M. Zubkov and A. A. Serov. A complete proof of universal inequalities for the distribution function of the binomial law. Theory of Probability & Its Applications, 57(3):539–544, 2013.

CEO of Mobileye, SVP at Intel, Co-CEO of OrCam, Chairman of AI21labs & Sachs Prof. of Computer Science at the Hebrew University of Jerusalem