This weekend I got frustrated by the lack of easy-to-find and up-to-date national- and state-level coronavirus trends. Specifically:
- How many cases and deaths have been reported each day? And,
- What is the rate of change?
So I created my own, using data from the New York Times, which they’ve made publicly available on GitHub and update daily.
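For anyone who would rather script the two questions above than use a spreadsheet, here is a minimal Python sketch. It is not how the Sheet itself works. It assumes the NYT state file has `date`, `state`, `fips`, `cases`, and `deaths` columns holding cumulative counts, and it assumes "rate of change" means the day-over-day percent change in cumulative cases. The sample rows are made up for illustration, and the function name `daily_trends` is hypothetical.

```python
import csv
import io

# Hypothetical sample rows in the layout assumed for the NYT state file
# (cumulative counts per state per day). These numbers are invented.
SAMPLE_CSV = """date,state,fips,cases,deaths
2020-03-26,Colorado,08,100,2
2020-03-27,Colorado,08,150,3
2020-03-28,Colorado,08,210,5
2020-03-29,Colorado,08,260,5
2020-03-27,Texas,48,90,1
"""


def daily_trends(csv_text, state):
    """Return daily new cases/deaths and day-over-day growth for one state."""
    rows = [r for r in csv.DictReader(io.StringIO(csv_text)) if r["state"] == state]
    rows.sort(key=lambda r: r["date"])  # ISO dates sort correctly as strings

    trends = []
    prev = None  # (cumulative cases, cumulative deaths) from the previous day
    for r in rows:
        cases, deaths = int(r["cases"]), int(r["deaths"])
        if prev is not None:
            prev_cases, prev_deaths = prev
            # Growth is undefined when the previous cumulative count is zero.
            growth = (cases - prev_cases) / prev_cases if prev_cases else None
            trends.append({
                "date": r["date"],
                "new_cases": cases - prev_cases,
                "new_deaths": deaths - prev_deaths,
                "case_growth": growth,
            })
        prev = (cases, deaths)
    return trends


for row in daily_trends(SAMPLE_CSV, "Colorado"):
    print(row)
```

To run it on real numbers, replace `SAMPLE_CSV` with the contents of the downloaded file. The first day for each state is skipped because there is no prior day to compare against.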
I know that a lot of very smart people (statisticians, epidemiologists, and medical professionals, none of which I am) are researching coronavirus and developing robust models. But their raw data and findings aren’t necessarily updated daily or shared with the public. I’m also aware that mine is a very simple analysis, with no accounting for, say, the availability of test kits, undiagnosed cases, or hospital bed counts.
I’ve run numbers only for the US generally, and for states and counties that impact me most:
- Colorado, Massachusetts, and Michigan, where I live and have family;
- Alaska and California, where I have trips planned in June and July; and finally,
- New York and Washington, which perhaps are harbingers for the country.
If you want other state data:
- Make a copy of the Sheet;
- Duplicate any one of the state tabs, except Colorado, which sometimes includes manual data from our health department that’s not yet in the NYT data; and,
- Enter your state name in cell B1.
To keep your Sheet updated, the raw data must be downloaded from GitHub and imported. Or just come back here; I’ll be doing that almost daily.
For easier viewing, open this Sheet in a new window.
To edit this Sheet, create your own copy under “File.”
Analysis and questions
March 31
The IHME model suggests that the US will reach its peak on April 15; Colorado, two days later on April 17.
But I’m uncertain how they’re getting this result, at least for Colorado. For five consecutive days (March 26-30), Colorado has reported flat-line or negative growth in both cases and deaths. Under the current circumstances (few people seem to be going anywhere or interacting with anyone, at least in Boulder), it’s hard to imagine that a five-fold surge in cases is coming our way.
If a few days of additional data support the possibility that Colorado has flattened its curve, that raises the next question: How do we get out of this without igniting it again?
Wow, quite the effort!
Like everyone else, especially the authorities, I naturally want to see some numbers so I know what is going on.
However, statistics for CV19 unfortunately remain in the “good guess” category, because the raw data is terrible. We probably know “Deaths”, but not “Cases”, because most people can’t get tested, the hospital often doesn’t want you to get tested, and it is not possible to purchase a thermometer anywhere (I tried). So “Cases” are largely a function of how many people are being tested, plus the reporting protocol for each health department.
So … I don’t know. It’s a bizarre situation, in that “Physical Distancing” (“Social Distancing” is an unhealthy thing to do and so I don’t use the term) is mostly all a non-medical professional can do. So I’m definitely doing that, and hope the curve trends down so we can ease up by June.
Keep up the good work!
I am sure you are aware of this, but at least in Alaska, there is a pretty limited amount of testing and the turnaround on testing is pretty long. Anecdotally I know of one family who were tested last Monday and still have not gotten test results back yet – so over a week.
Stay healthy!
I don’t see the option to copy the google sheet. I can’t open in my own google sheets page either. If possible could you add TX to the tabs?
Curious how ‘accurate’ the data is as I’ve been hearing so much of people with symptoms being refused a test here unless they are going to be admitted to the hospital, which leads me to believe numbers of actual cases are considerably higher. None of those are reported on or listed in state or county reports. I don’t have a fix, but just concerns about the real impact.
Should be fixed now. Here’s the link you need, https://docs.google.com/spreadsheets/d/1OlKEuDdNbpzi6so3HVQZ0_jkI24r2XBHzmnae0OGiGU/edit?usp=sharing
Thank you, much appreciated!
Hi Andrew, Here is a great site for what I think you’re looking for: https://www.politico.com/interactives/2020/coronavirus-testing-by-state-chart-of-new-cases/
Also, a link to the source data: https://covidtracking.com/
The map with slider for date graphically showing number of tests and number of positives is very enlightening.
This was helpful, thanks. I’ve been looking for better visualizations so the tips here made it pretty easy to just get the raw data and plot my own. 🙂
Check out worldometer. Also, check out the YouTube channel Medcram for his daily updates. He provides a lot of links to get at virus data.
The biggest challenge with US data is the lack of widespread testing.
Agree…I know at least 6 people personally who almost surely have it and were never tested. Turnaround for testing now is about 24 hours (possibly earlier if you get testing in right before they run the batch).
91-DIVOC.com is a great visualization tool as well, and seems to be updated daily. Data is viewable by state and by country, on both linear and logarithmic scales. No maps, but a powerful way to see a lot of data at once.
Andrew,
Thanks for pulling together these data. I found these two pages to have good data and visualization.
See chart 4, COVID-19 Cases by US States/Territories, normalized by population
http://91-divoc.com/pages/covid-visualization/
and of course the compiled source data from Johns Hopkins
https://coronavirus.jhu.edu/map.html
Stay safe!
https://ncov2019.live/data
https://covid19.healthdata.org/projections?fbclid=IwAR08GeJ_ZSy4Eeqg4uTd68lCxymNOzbla3szuMvqq2_ocRJQPFdXQQ-lhhE
This is a good resource for predicting surges by state with respect to hospital resource allocation. This content is of personal interest to me, but others may find it helpful.
I wanted to follow up with one more link to a FiveThirtyEight article talking about how it’s hard to make a data model for this pandemic. Granted this is predictive data and not analyzing existing data, but I thought the commenters on this post would find it interesting.
https://fivethirtyeight.com/features/why-its-so-freaking-hard-to-make-a-good-covid-19-model/
Wow … turns out backpackers like data! (I’d like to see a chart tracking everyone’s Base Weight by age, state, number of years backpacking, with modeling to predict when it will be at its lowest – my guess is an inverted bell curve with the lowest weight at 5 years …)
I’m a day late with this, but here is the website I check every morning for my daily guidance – you really need to see this, only takes seconds (and if helpful, also take the Quiz): http://wisdomofchopra.com/
For a critical view you can check out this page:
https://swprs.org/a-swiss-doctor-on-covid-19/
NYT seems to have pages upon pages dedicated to interactive modeling.
https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html?action=click&module=Spotlight&pgtype=Homepage#cases
Just another reco for a site I think you’d like. It allows you to look worldwide, by country, by state, and by county, and the visualizations are really useful. https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
I have been finding most of my data through the subreddit r/dataisbeautiful. The subreddit has nothing to do with Covid, obviously, but about 80% of the current content is Covid specific.
Like many, I’m also wondering what opening back up will look like…I assume it’s not just going to be a big bang where we all go back to normal. Instead, I am assuming it will look like a stepped re-introduction of different types of businesses opening up. Sort of like the stages of the shutdown, but slower.
At the beginning of this crisis I heard one expert say, “people are talking about this as if we have to get ready for the Corona snowstorm. It’s not the Corona snowstorm, it’s the Corona winter.” As someone who’s been in work from home mode for four weeks right now…sounds about right.
I’d be wary of using “data” from the New York Slimes.
A couple of Colorado-specific ones:
https://data-cdphe.opendata.arcgis.com/datasets/colorado-covid-19-positive-cases-and-rates-of-infection-by-county-of-identification
Not data, but the results of the CDPHE model, ostensibly being used by CO Govt to figure out if SD measures are effective:
https://covid19.colorado.gov/press-release/state-releases-new-modeling-findings