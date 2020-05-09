|
The ‘elite’ should learn to code (better)
“Science is the belief in the ignorance of experts.” (Richard Feynman)
Over a decade ago, climategate confirmed that Jones, Mann and Briffa knew exactly what they were doing when they scaled the hockey stick to hide the decline while having not a clue about what the decline meant. However, it incidentally revealed the, uh, ‘quality’ of their code.
Neil Ferguson’s extra-lockdown/marital escapade says much about his elite opinion of us common people (and of the lockdown), but meanwhile someone has taken a look at the, uh, ‘quality’ of his code.
Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.
On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector.
Read the review to see what leads to these conclusions. You have to laugh (in order not to cry 🙂 ).
|
The populist right continue to blame anyone (the media, scientists, China, etc.) but the government itself for lockdown, because they have put so many of their hopes in the figure of Boris Johnson. As one MP has said: “We can’t blame the PM so we’ve decided to blame the advisers”.
If Boris announces only minor changes to current policy tomorrow I wonder if the anti lockdown right will start to turn their fire on the government. My guess is most will just continue to blame ‘The Liberal Elite’, a usefully nebulous scapegoat if ever there was one.
Let us say, for the sake of argument, that the mathematical death model of Professor Ferguson had been CORRECT – not wildly wrong.
This would still in no way have justified the “lockdown”: there was no evidence that locking up the population would stop the spread of the virus. Vast numbers of people would have died either way (with or without the “lockdown”). And that assumes that the purpose of the “lockdown” was to combat the virus – which it is now clear that it WAS NOT.
The purpose of the “lockdown” was to spread the sort of social control that the Green activist that Professor Ferguson was involved with wanted – but it is not JUST her (or him). It is the entire international establishment – including much of Corporate Big Business.
Some “conspiracy theories” are true – that does not mean they released the virus on purpose (it may have been a genuine accident in their internationally funded lab near Wuhan) – but “NEVER LET A CRISIS GO TO WASTE”.
The “educated” international establishment were eager to use the virus as an excuse for what they have long wanted to do – the destruction of liberty is their aim, the virus is just an excuse for actions they have long wished to undertake anyway.
People such as various “Progressive” Governors in the United States make this very clear “close the Churches – but keep the abortion mills open”.
No medical treatment for cancer or heart disease – but if you want to cut a baby up, there is plenty of hospital space open to you. But why one baby – let us cut up LOTS of babies. And why stop when the baby is born – both the Governor of Virginia and the Governor of New York (Media Darling Andrew Cuomo) say it is fine to kill babies AFTER they are born.
You want to go to Church? How dare you! Men with guns will burst in to prevent religious services (as they did in France – President Macron).
You want booze and drugs? Excellent! We want a degenerate population dependent on us for booze and drugs – have them for free (as they are doing in California) and we will put you up in a hotel – a hotel we will STEAL (as they are also doing in California).
“You wish to be critical of our allies the Communist Party Dictatorship in China? HOW DARE YOU. You RACIST. This is HATE SPEECH” the entire council of once conservative San Antonio (in Texas – not officially yet Mexico) voted to end Freedom of Speech – no doubt that will help treat the Chinese Virus (oh dear I just committed a crime).
It goes on – round the world.
They are releasing criminals (including murderers) from prison whilst, at the same time, disarming the honest citizens. This (both parts) is the policy of the government of Canada – and many other governments (fully supported by the international establishment elite, including much of big business).
The only good thing is that it is all out in the open now.
These are not good people who happen to have another point of view – they are utterly evil people who have taken the virus as an excuse to engage in the destruction of liberty they have long longed to do.
One cannot “reason with” a vicious individual such as the Governor of Michigan – one can only defeat them, or be destroyed by them.
One needs to take the “new normal” threatened (as an eternal destruction of liberty) by the Prime Minister of the Republic of Ireland, and the rest of the international establishment elite, and shove their “new normal” down their throats.
rosenquist makes a good point – and I have myself made excuses for Prime Minister Johnson and the rest of the elected politicians in this country, and I APOLOGISE FOR THAT.
Yes it is very difficult to say NO to the “experts” and “public servants” – but the elected government does have the power to say NO.
If the Prime Minister in his address on Sunday does not give a clear and early date for the end of “lockdown”, then what remains of his credibility will be gone.
As for the United States – only four (4) members of the House of Representatives voted against the bailout orgy that will, de facto, bankrupt the United States.
No other members of the House of Representatives (other than these four) deserve any respect – certainly not the “libertarian” Justin Amash who was given the chance to vote NO and did not do so.
The British House of Commons?
Looking past their tears to their actual voting behaviour – none of them (none of them – of any party) have behaved well.
The idea that the world economy was shut down due to simulations run on code that hasn’t been, and can’t be, regression tested is really something.
I mean the model isn’t even wrong. It’s just garbage.
“Read the review to see what leads to these conclusions. You have to laugh”
I did. But more at the review than the code. If this is the worst issue they can find, then I’d say it’s not a problem.
Of course, we’ve only just started. I appreciate the link to the code – I’ll definitely have a look later on – and I agree this is how science ought to work. And I might point out, this is hugely more impressive than the climate scientists, who were still resisting releasing their code decades later.
Monte Carlo methods work this way by design. You’ve got a hugely complicated statistical model that you can’t find the distribution for analytically. So what you do is you pick random input values covering the range of uncertainty, run the model on each, and you get a random spread of outputs that shows you the range of possible outputs fitting with the input. It’s like testing the accuracy of a rifle by picking it up and firing lots of shots at the target. The way you pick it up and hold it is slightly random, the wind and temperature and density of the air are random, the amount of propellant in each bullet and the weight and shape of each bullet is random, and so the spread of holes in the target is random. If you can get them all within a handspan of one another, that’s pretty accurate.
It’s a nice property to be able to replicate the test exactly – same handling, same air currents, same bullets. It makes it easier to identify problems. But nobody would say that rifle accuracy trials were useless because they could not be exactly replicated. What’s important is that you can replicate the size and shape of the spread, not the exact location of each hole. And this code review doesn’t even mention that.
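To make the rifle analogy concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption made for illustration: `toy_model`, the input ranges and the run count are hypothetical and have nothing to do with the Imperial code. Two ensembles run from different seeds produce different individual "holes", but roughly the same size and shape of spread.

```python
import random
import statistics

def toy_model(rate, contacts):
    # Hypothetical stand-in for an expensive simulation run.
    return rate * contacts * 100.0

def run_ensemble(seed, n_runs=2000):
    # Draw random inputs covering the range of uncertainty,
    # run the model on each, and collect the outputs.
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_runs):
        rate = rng.uniform(0.01, 0.05)    # assumed input range
        contacts = rng.gauss(10.0, 2.0)   # assumed input distribution
        outputs.append(toy_model(rate, contacts))
    return outputs

def summarize(outputs):
    # The "size and shape of the spread": here just mean and standard deviation.
    return statistics.mean(outputs), statistics.stdev(outputs)

spread_a = summarize(run_ensemble(seed=1))
spread_b = summarize(run_ensemble(seed=2))
print("ensemble A (mean, sd):", spread_a)
print("ensemble B (mean, sd):", spread_b)
```

The individual outputs of the two ensembles will not match, but the two summaries should agree to within sampling error. That agreement is the kind of replication the analogy is about.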
But we’ll see. I await further developments with interest.
Niv,
Apparently the sim code does not reproduce output even in a single thread environment.
Assuming there are no hw interrupt or synch issues, which there should not be, then in my experience the most likely problem is failure to initialize some part of memory correctly. This can cause “random” results depending on the state of the hw at the start of each run. There are other possibilities.
Back when I was doing this stuff for pay, I’d usually resort to a complete power cycle of the system in an attempt to reproduce such behavior. But it’s definitely a bug.
And yes, I do know how rng’s and seeds work.
The non-portability to other computer systems is indicative of a related class of poor coding practices, which can also cause non-reproducibility.
What regression testing is there for the human brain?
This especially when the code examined is not the original code? That requires pregression testing.
Gross QA policy failure here. Pot calling the kettle black.
It’s not that that ‘expert’ is not wrong, but that the code analysis ‘evidence’ is junk.
Keep safe and best regards
“Assuming there are no hw interrupt or synch issues, which there should not be, then in my experience the most likely problem is failure to initialize some part of memory correctly.”
The most likely problem, I suspect, is that they’re not caching the state of the random number generator along with the tables of results.
So the first run initialises the random number generator, reads out a few thousand random numbers to generate the tables, caches them to a file, then carries on with more random numbers being used to generate the final results.
The second run initialises the random number generator, loads the intermediate results tables from the file, then carries on reading out more random numbers to get the final results. But because these random numbers are now coming from the start of the sequence, not a few thousand values in, the results are different.
If you want exactly replicable results when caching intermediate values, you also have to cache all the random number generator states, or reset them again to new starting seeds, or use different sub-streams before and after the cache step.
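A minimal Python sketch of that failure mode and one of the fixes. This is a guess at the mechanism, not the actual Imperial code: `build_tables`, `final_results` and the in-memory `cache` dictionary (standing in for the cache file) are all hypothetical names.

```python
import random

def build_tables(rng, n=1000):
    # Expensive intermediate step that consumes random numbers.
    return [rng.random() for _ in range(n)]

def final_results(rng, tables, n=5):
    # Final step that consumes further random numbers.
    return [t + rng.random() for t in tables[:n]]

def run(seed, cache, restore_rng_state):
    rng = random.Random(seed)
    if "tables" in cache:
        tables = cache["tables"]
        if restore_rng_state:
            # Put the generator back where it was just after the tables were built.
            rng.setstate(cache["rng_state"])
    else:
        tables = build_tables(rng)
        cache["tables"] = tables
        cache["rng_state"] = rng.getstate()
    return final_results(rng, tables)

cache = {}
first = run(42, cache, restore_rng_state=False)   # builds and caches the tables
second = run(42, cache, restore_rng_state=False)  # cached tables, RNG back at the start
third = run(42, cache, restore_rng_state=True)    # cached tables plus cached RNG state
print(first == second, first == third)  # expected: False True
```

The second run differs from the first even though the seed is the same, because its final step reads from the start of the sequence rather than a thousand values in. Restoring the saved state (or switching to a separate sub-stream after the cache step) makes the runs match again.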
I don’t know what the problem actually was, but at first glance at the description of what they did, that’s where I’d start investigating.
Oh, and Fred the Fourth + a million.
And maybe NiV too – though without evidence from the original code and dynamic examination, how would one know?
The person who wrote that doesn’t seem to grasp what the review explained. Well designed programs using Monte Carlo methods use pseudo-random number generators that take a “seed” value to initiate them. Given the same seed, the pseudo-random number generator will produce the same sequence of “random” numbers. That allows a Monte Carlo simulation to *exactly* reproduce the same result each time it is run using the same seed. If it doesn’t, that means there is a bug. The Imperial model code does *not* reproduce the same results each time and is therefore buggy.
If someone wrote a program to solve a mathematical equation that has a single result, it should produce the same answer each time. If it produced the result 40 once and then 400 the next time it ran: people would consider it buggy and have no reason to know whether *any* of the results it produced were accurate. Instead, the approach used by the Imperial College model folks would be to average the 40 and 400 results and come up with 220 and say that is the answer (or whatever other average result from more runs): when there is no reason to know for sure what the nature of the bug was and whether the result was anywhere in that range at all.
A stochastic simulation should be able to be run more than once to produce the exact same results: it merely internally runs multiple simulations with different random numbers started from the same seed, or is run multiple times given different seeds. Ideally you’d have more than one person/team produce a simulation model using the same specs: and then test them running against each other.
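As a rough illustration of what that looks like in practice, here is a toy Python sketch. `simulate_outbreak` and the `OFFSPRING_CHOICES` table are hypothetical and far simpler than any real epidemic model. The point is only that all randomness flows from one seeded generator, so a given seed reproduces the run exactly and different seeds give different runs.

```python
import random

# Hypothetical offspring distribution: each case infects 0 to 3 others (mean 1.5).
OFFSPRING_CHOICES = [0, 1, 1, 2, 2, 3]

def simulate_outbreak(seed, generations=10, initial=5):
    # Every random draw comes from this one seeded generator.
    rng = random.Random(seed)
    infected = initial
    history = [infected]
    for _ in range(generations):
        new_cases = 0
        for _ in range(infected):
            new_cases += rng.choice(OFFSPRING_CHOICES)
        infected = new_cases
        history.append(infected)
    return history

# Same seed: the exact same trajectory. This is what a regression test relies on.
assert simulate_outbreak(7) == simulate_outbreak(7)

# Different seeds: a spread of trajectories to analyse statistically.
for seed in range(3):
    print(seed, simulate_outbreak(seed))
```

If the first assertion ever fails on the same machine with the same inputs, something other than the seed is influencing the result, which is the kind of bug being described.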
Even aside from the problems with the code itself: there are problems with the underlying nature of the model and the assumptions it uses. This is the same modeling group that in the past produced results that proved to be wildly out of touch with the reality of what happened: and yet they don’t seem to have ever gone back to understand why and updated their methods since they continued to produce results that predicted absurdly high results compared to what wound up happening.
Most epidemiological models are too simplistic since they assume the same constant R, the number of people a virus infects, for the whole populace. I haven’t examined this one, but I suspect it does the same thing, whereas in many cases there are “superspreaders” that are responsible for a disproportionate number of infections. In this case I’ve seen at least one study indicating that may be true of this virus, though I haven’t checked for critiques or confirmation:
Epidemiology and Transmission of COVID-19 in Shenzhen China: Analysis of 391 cases and 1,286 of their close contacts
Most models just use a static average R but neglect to take into account the impact of these superspreaders on how R changes over time. A top level analysis suggests this should undermine the typically projected exponential spread. If superspreaders come into contact with many people to spread the disease, they may also be more likely to come into contact with other superspreaders earlier than the rest of the public and therefore get the disease earlier. As superspreaders become immune, that would lower the average R and therefore lower the level required for herd immunity. Picture the case where all superspreaders were immune, the average R would drop drastically.
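A back-of-the-envelope version of that last point, in Python. The numbers are invented purely for illustration (10% of people with R = 10, the rest with R = 1.5), and the calculation uses the textbook threshold 1 - 1/R while ignoring how people actually mix, so treat it as a sketch of the arithmetic rather than a prediction.

```python
def average_r(groups):
    # groups: list of (population_fraction, r_value) for people still able to spread.
    total = sum(fraction for fraction, _ in groups)
    return sum(fraction * r for fraction, r in groups) / total

def herd_immunity_threshold(r):
    # Classic homogeneous-mixing threshold 1 - 1/R; zero if R is already at or below 1.
    if r <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r

# Hypothetical population: 10% superspreaders (R = 10), 90% everyone else (R = 1.5).
everyone = [(0.1, 10.0), (0.9, 1.5)]
# Same population once all the superspreaders are immune.
without_superspreaders = [(0.9, 1.5)]

for label, groups in [("everyone", everyone),
                      ("superspreaders immune", without_superspreaders)]:
    r = average_r(groups)
    print(f"{label}: average R = {r:.2f}, threshold = {herd_immunity_threshold(r):.0%}")
```

With these made-up inputs the average R falls from 2.35 to 1.5, and the naive threshold from about 57% to about 33%, once the superspreaders drop out of the average.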
At least one study has attempted to take into account the variance of susceptibility to the virus which also varies, though I haven’t delved into it to see how credible it is or looked for critiques:
Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold.
PS, here is a page from the Santa Fe Institute on the impact of a populace with superspreaders where R varies:
Transmission T-024: Cristopher Moore on the heavy tail of outbreaks
“If someone wrote a program to solve a mathematical equation that has a single result, it should produce the same answer each time. If it produced the result 40 once and then 400 the next time it ran: people would consider it buggy and have no reason to know whether *any* of the results it produced were accurate.”
The solution to the equation in this case is a statistical distribution. The ‘answer’ is that it should output a number between 10 and 500, say, with a particular complicated, non-analytical distribution. Run it 1000 times, it comes up with various numbers between 10 and 500. If you tweak a line that shouldn’t change the output and run it again and it suddenly comes up with 50,000, that’s a problem. But if the right answer is “between 10 and 500”, then both 40 and 400 are perfectly satisfactory.
Writing simulations that can be made deterministic by setting the seed is useful for debugging, and comparing differing implementations, and in particular for producing automated regression tests that can be run without the application of any intelligence, but it’s actually a bad idea to do it when doing the science. If you always use the same sequence, then you’re not exploring the full range of variation that genuinely random inputs will give you. If the particular bit of sequence you pick happens to miss a problem case that would trigger a failure, and you keep on using the same set of random numbers, you’ll never see it. A better approach for the science is to generate new seeds each time you run, but make a record of them so if such a problem does show up, you can debug what happened.
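A minimal Python sketch of that "fresh seed, but write it down" approach. `noisy_model` and `run_once` are hypothetical names, and printing the record as JSON is just one assumed way of keeping it.

```python
import json
import random
import secrets

def noisy_model(rng):
    # Hypothetical stand-in for a stochastic simulation.
    return sum(rng.random() for _ in range(100))

def run_once(seed=None):
    # Pick a fresh seed for each scientific run, but record it with the result
    # so that any surprising run can be replayed exactly when debugging.
    if seed is None:
        seed = secrets.randbits(64)
    rng = random.Random(seed)
    return {"seed": seed, "result": noisy_model(rng)}

record = run_once()
print(json.dumps(record))  # in practice this would go to a log file

# Later: replay the exact run from the recorded seed.
replay = run_once(seed=record["seed"])
assert replay["result"] == record["result"]
```

Each run explores a different part of the random sequence, yet nothing is lost for debugging, because the seed travels with the output.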
I agree absolutely that it’s good practice and very useful if the simulation has this property, but it’s also very hard to do, takes a lot of extra time and effort to get right, and is not as critical to the correctness of the scientific results as the article might give the impression is the case.
And yes, using a fixed R0 for the entire population and using average values for transmission rates misses rare events out in the tails or in particular sub-populations, which is precisely why these sorts of models use Monte Carlo methods to model the population in much finer detail than the SIR-based models everyone else uses.
It’s great that people are looking at it now, but I’ll wait until they find something more interesting before I get excited. As for the practices of the software engineering community, I’d be curious to know why my PC spends time every month downloading software/security patches if commercially engineered software is all bug-free? ‘Microsoft!’ used to be a swearword, round my way. I’ve never come across any large bit of software that didn’t have bugs, despite all the trendy methods they use, and some of the worst were the ones engineered by the strictest, most formal and bureaucratic procedures. (Let’s not even talk about the ‘Waterfall’ model…) What’s more important is whether it’s open to examination and continual testing, and whether they fix problems when they find them. It’s the process that matters.
This is getting to be a surprising habit, but I largely agree with NiV.
In particular, I have great problems with the Imperial team’s modelling. However I have not seen a single criticism of Imperial’s modelling in Sue Denim’s paper.
You have got to distinguish between a model (an attempt at a mathematical idealisation of reality, probably including some randomness) and the code, which is an attempt to implement that model.
It is one of the major fallacies of the 21st century to confuse these two.
Indeed, I would argue that “ensemble modelling”, as perpetrated by climate scientists, is, in part, an example of that fallacy in practice.
Here are a few propositions that I might defend in future comments.
* Whether the ICL model is correct or not, is of no more practical interest than whether Prof. Ferguson’s lover is married or not.
* One thing that IS relevant, is whether the ‘lockdown’ worked. That is up for debate, since in many countries +US states the ‘lockdown’ was imposed at the same time as distancing. How could we possibly know whether distancing alone would have been just as good?
* In many countries +states, people talk about a ‘lockdown’ when in fact all that happened is no more than enforced distancing.
* Almost all of the economic damage would have been achieved anyway, due to people being rationally afraid of catching the virus. Most of the rest of the economic damage could have been avoided by hiring intelligent policepersons.
* Absence of evidence is not evidence of absence. The fact that we do not have a good model is exactly why governments should have closed borders, enforced masks +distancing, and done as many tests as possible **EARLY**. Intelligent governments have done 3 or 4 of the above. Boris did none of the above until it was TOO LATE.
* Paul Marks has an increasingly tenuous grasp of reality; but more firm than e.g. Mr Ecks.
I would suggest that the whole argument is moot anyway as, just as it is impossible to predict the weather more than a few days into the future, the progress of a virus is likewise impossible to predict. There are just too many variables and unknowns to make it work, even if your computer program is perfect and replicable.
Niv,
Cache…
Yup, quite possible.
Way way back at Big Silicon Valley Co. we used the term (from biology and other places) “emergent behavior” to describe the unpredictable behavior of systems with multiple interacting caches.
But we also ran automated regression testing round the clock. Good design is one thing, but so is Defense In Depth.