I often approach this from an “upper bound” and “lower bound” perspective. Suppose I am trying to extract a signal from data that is noisy and sparse. Often the most time-consuming part of solving the problem is figuring out how to deal with all of the noise, and I don’t want to waste time doing that just to come up with nothing in the end.
An “upper bound” approach would be to simulate some data that is perfectly clean and meets all my model assumptions, and test my basic method on it. If it doesn’t work on the clean data, it won’t work on the noisy data.
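As a rough sketch of that clean-data check, here is a toy version in Python. Everything in it is an assumption made for illustration: the "signal" is a straight line with a planted slope and intercept, the "basic method" is ordinary least squares, and the names `simulate_clean` and `fit_line` are hypothetical stand-ins for whatever model and method you actually have.

```python
import random

def simulate_clean(n=200, slope=2.0, intercept=1.0, seed=0):
    """Generate noise-free data that satisfies the (assumed) linear model exactly."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [slope * x + intercept for x in xs]  # no noise added
    return xs, ys

def fit_line(xs, ys):
    """Stand-in for the 'basic method': ordinary least squares for a line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs, ys = simulate_clean()
est_slope, est_intercept = fit_line(xs, ys)

# If the method cannot recover the planted parameters here, stop and rethink.
recovered = abs(est_slope - 2.0) < 1e-6 and abs(est_intercept - 1.0) < 1e-6
print("method recovers the planted signal:", recovered)
```

The point is only the shape of the test: plant a known answer, run the method, and compare. In practice the simulator should encode your own model assumptions rather than a line.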
The “lower bound” approach would be to try the simplest and dumbest thing I can think of to deal with the noise in the real data. If I can pick up some amount of signal even with the dumb approach, it makes me much more confident that if I spend time making a sophisticated noise model, it will be worth it.
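A minimal sketch of the dumb-first check might look like the following. Since there is no real dataset here, the code fabricates a noisy, sparse stand-in (70% of the values missing, heavy Gaussian noise); that stand-in, the choice of Pearson correlation as the crude statistic, and the shuffle test are all assumptions for illustration, not a recommendation of a specific method.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)

# Hypothetical stand-in for the real data: a weak linear signal, heavy noise,
# and roughly 70% of the measurements missing (None).
xs = [rng.uniform(0.0, 10.0) for _ in range(500)]
ys = [0.5 * x + rng.gauss(0.0, 2.0) if rng.random() < 0.3 else None for x in xs]

# The dumbest possible noise handling: drop missing values, model nothing.
kept = [(x, y) for x, y in zip(xs, ys) if y is not None]
kept_x = [x for x, _ in kept]
kept_y = [y for _, y in kept]
observed_r = pearson_r(kept_x, kept_y)

# Crude baseline: how often does shuffled data look at least this correlated?
n_shuffles = 200
hits = 0
for _ in range(n_shuffles):
    shuffled = kept_y[:]
    rng.shuffle(shuffled)
    if abs(pearson_r(kept_x, shuffled)) >= abs(observed_r):
        hits += 1
p_value = hits / n_shuffles

print("observations kept:", len(kept_x))
print("observed correlation:", round(observed_r, 3))
print("fraction of shuffles at least as extreme:", p_value)
```

If even this shows a correlation that the shuffled baseline rarely matches, that is some evidence a more careful noise model has something to find.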