Source: Benjamin Van Vliet and Robert Hendry. Modeling Financial Markets: Using Visual Basic .NET and Databases to Create Pricing, Trading, and Risk Management Models. McGraw-Hill Pub. Co., 2004. [B086]
Getting clean data isn't easy:
Despite the efforts of primary, secondary, and sometimes even tertiary data vendors, data are often either missing or incorrect in some way. If ignored, this problem can lead to disastrous consequences for the quant. […] It's worth noting that although some of the following data problems seem egregious or obvious to a human, it can be challenging to notice such problems in a trading system that is processing millions of data points hourly (or even within one minute, as in the case of high-frequency traders). The first common type of data problem is missing data, as we alluded to. Missing data occur when a piece of information existed in reality but for some reason was not provided by the data supplier. […] After all, zero and nothing have a lot in common. However, there is a very different implication to the model thinking the price is now zero (for example, if we were long the instrument, we'd be showing a 100 percent loss on the position) versus thinking that the price is unknown at the moment. To fix this problem, many quants program their database and trading systems to recognize the difference between zero and blank.
Source: Rishi K. Narang. Inside the Black Box: A Simple Guide to Quantitative and High Frequency Trading. John Wiley & Sons, 2013. [B080]
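The distinction between zero and blank that Narang describes can be illustrated with a minimal sketch. It is not taken from the book: the function names `parse_price` and `position_return` are hypothetical, and it assumes the vendor delivers each price as a text field that may be empty. A missing price is represented as `None`, so it can never be mistaken for a genuine price of zero.

```python
import math
from typing import Optional


def parse_price(raw: Optional[str]) -> Optional[float]:
    """Return the price as a float, or None when no usable value was supplied.

    Hypothetical helper: assumes the vendor sends prices as text fields.
    """
    if raw is None or raw.strip() == "":
        return None  # blank field: the price is unknown, not zero
    value = float(raw)
    if math.isnan(value):
        return None  # treat an explicit NaN as missing as well
    return value


def position_return(entry_price: float, last_price: Optional[float]) -> Optional[float]:
    """Fractional return on a long position, or None when the last price is unknown."""
    if last_price is None:
        return None  # do not report a loss on the basis of missing data
    return (last_price - entry_price) / entry_price


print(position_return(50.0, parse_price("0")))  # genuine zero price: -1.0, a 100 percent loss
print(position_return(50.0, parse_price("")))   # blank field: None, return is unknown
```

Keeping "unknown" as a separate value forces every downstream calculation to decide explicitly what to do when a price is missing, instead of silently treating the gap as a total loss.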