Is there some formal way(s) of quantifying potential flaws, or risk, and ensuring there’s sufficient spread of tests to cover them? Perhaps using some kind of complexity measure? Or a risk assessment of some kind?

Experience tells me I need to be extra careful around certain things - user input, code generation, anything with a publicly exposed surface, third-party libraries/services, financial data, personal information (especially of minors), batch data manipulation/migration, and so on.

But is there any accepted means of formally measuring a system and ensuring that some level of test quality exists?

  • But is there any accepted means of formally measuring a system and ensuring that some level of test quality exists?

    Formally? No, this is basically impossible by Rice’s Theorem. There is not even a guarantee that if you have 100% test coverage, the program is good (the tests could be flawed).

    This is just a natural limitation of turing completeness. You can’t decide these properties while also having full computational power. In order to decide such things, you need a less powerful mode of computation (something not turing complete) that can be analyzed more thoroughly and with more guarantees.

      • Yea I’m afraid the only real way to “measure” that is to read through the tests and the code and make a good ol human value judgement on the state of the code and tests. But it won’t give you a number.