Paper Highlight: Best-Practice DFT Protocols…

Computational chemistry has really taken off in the last few decades and is, I daresay, maturing. Rather than commenting on this Grimme review directly, I’ll make some general observations from the perspective of a mechanistic chemist.

Ah, B3LYP/6-31G*, our old friend! This model chemistry is overly maligned. It is widely successful and reasonably priced, even if it has some issues. Vanilla B3LYP doesn’t understand dispersion, but that is easily fixed with modern corrections (like Grimme’s -D4). 6-31G* is indeed too small, so energies are inaccurate and complexes are too sticky (look up basis set superposition error, BSSE), but the geometries are reasonable and you can get single points with a bigger basis set. Yes, density functionals over-delocalize electrons and can’t get H2 dissociation right. But for the most part, dispersion-corrected B3LYP with modern basis sets is pretty good.

Ten years ago, I had a little sabbatical in the Singleton group. It was an intense time. We hunkered down in his office for 12-hour days with deep conversations about every aspect of physical-organic chemistry. One question was: “What if we had coupled-cluster-quality calculations at B3LYP cost?” (This is the “heaven of chemical accuracy” in Jacob’s Ladder, see Fig. 3.) Because not all problems are due to electron correlation or basis set incompleteness, we predicted that it would help, but only in some cases.

With DLPNO-CCSD(T), we have actually reached heaven. (Its cost is something like that of an expensive DFT calculation in CPU time and a cheap MP2 calculation in memory.) It’s being widely used to generate benchmarks so that people can work more confidently with DFT. I am optimistic that ML potentials trained on DLPNO data will soon be able to carry out optimizations using automatic differentiation.

And yet, we still have not achieved “mechanistic accuracy.” We get many mechanisms wrong. Just because one can calculate a pathway with a reasonable barrier does not necessarily mean the mechanism is correct. Nor do we know how to design reactions, catalysts, or materials prospectively (though we’re working on it).

Why is that? One reason: we can’t estimate entropy or model solvation effects very well. (Computed selectivities, such as enantiomeric excesses, are usually too high because enthalpy is captured but entropy isn’t.) Calculations are also only as good as what the operator is imagining. Real systems depend on the unforeseen vagaries of binding constants, aggregation, undesired pathways, etc. Until we have much higher accuracy, the “in silico flask” is still a dream.
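
To get a feel for how unforgiving selectivity predictions are, here is a back-of-the-envelope sketch. It assumes the simplest possible picture, which is an assumption for illustration and not something taken from the review: two competing transition states under Curtin-Hammett conditions, so the product ratio is exp(ΔΔG‡/RT) and the enantiomeric excess is tanh(ΔΔG‡/2RT). The function name `ee_from_ddg` is made up for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ee_from_ddg(ddg_kcal, temp_k=298.15):
    """Enantiomeric excess (as a fraction between 0 and 1) from the
    free-energy gap between two competing transition states.

    Assumes a simple two-transition-state, Curtin-Hammett picture:
    ratio r = k_major / k_minor = exp(ddG / RT)
    ee = (r - 1) / (r + 1) = tanh(ddG / (2 RT))
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temp_k))


# Tabulate ee for a few free-energy gaps at room temperature.
for ddg in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"ddG = {ddg:.1f} kcal/mol -> ee = {100 * ee_from_ddg(ddg):.0f}%")
```

In this idealized model, a gap of 1 kcal/mol at room temperature corresponds to an ee of about 69%, while 2 kcal/mol gives about 93%. An error of 1 kcal/mol in the computed gap is therefore enough to turn a mediocre selectivity into an excellent one on paper.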

As my friend Corin Wagen says on his blog, we need to strike a balance between excessive credulity and excessive skepticism. Computations are very useful for understanding reaction outcomes and are becoming useful for prospective design, but they have real limitations that must be understood and tested by experiment. In the coming years, I expect ab initio and DFT methods to become competitive, ML to help a lot, and predictions/design to get much better.