• It’s hard if you have unprotected, shared mutable state. If you use a language built around immutable data structures (Haskell, Clojure, Erlang) it’s easy! If you use a language that won’t let you share mutable data without the required protection (Rust) it’s also easy! With anything else, you can be sure that even if it looks like it works, it most likely doesn’t.

    • Yep. I remember parallelising embarrassingly parallel C++ code once. I had an array for the results of each job, but forgot to give each job its own index in the array. So every job wrote its result to the zeroth entry in parallel. It was cache efficient this way, but the result was not what I expected.
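
      A minimal sketch of the fixed version, not the original code: `run_jobs` and `do_job` are hypothetical names, and `do_job` just squares its input as a stand-in for real work. Each thread gets its own index and writes only to `results[i]`, so no two threads touch the same element.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <iostream>
      #include <thread>
      #include <vector>

      // Hypothetical stand-in for one unit of real work.
      int do_job(int input) { return input * input; }

      // Runs one thread per job; job i writes only to results[i].
      std::vector<int> run_jobs(std::size_t n_jobs) {
          std::vector<int> results(n_jobs, 0);
          std::vector<std::thread> workers;
          workers.reserve(n_jobs);
          for (std::size_t i = 0; i < n_jobs; ++i) {
              // Capture i by value so each thread keeps its own index.
              // The buggy version effectively wrote to results[0] from
              // every thread, which is a data race.
              workers.emplace_back([&results, i] {
                  results[i] = do_job(static_cast<int>(i));
              });
          }
          for (auto& w : workers) w.join();
          return results;
      }

      int main() {
          const std::vector<int> results = run_jobs(8);
          assert(results.size() == 8);
          for (int r : results) std::cout << r << ' ';
          std::cout << '\n';
      }
      ```

      With GCC or Clang on Linux this typically needs `-pthread` when compiling and linking.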

    • Even if you work in permissive languages like Java, if you start from the position of “this code will need to be thread-safe”, it’s still not hard to do. The problem I’ve seen with more permissive languages is that external libraries have bad systems built in, and even if the code you write is thread-safe, the external library can introduce all sorts of weirdness.

      For example, in Java the SimpleDateFormat class, which is a fairly standard way of parsing and outputting dates and is the go-to example in any number of tutorials, is not thread-safe (and that’s part of the JDK!). Someone can write sensible stateless code and still get tripped up by this.

    • The other way to make it easy is to turn on auto-parallelization and/or auto-vectorization in your C compiler. Then no problem. You of course have to write code that the compiler can usefully parallelize and/or vectorize. A lot of stuff can be.
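
      As a sketch of what compiler-friendly code tends to look like: a simple counted loop over contiguous arrays, with no dependency between iterations. The function name `saxpy` is just an illustrative choice. With GCC, for example, `-O3` enables the auto-vectorizer, `-fopt-info-vec` reports which loops were vectorized, and `-ftree-parallelize-loops=4` additionally asks for loop auto-parallelization. Whether a given loop is actually transformed depends on the compiler version and target.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <iostream>
      #include <vector>

      // y[i] = a * x[i] + y[i]. Each iteration is independent of the
      // others, which is the property auto-vectorizers look for.
      void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
          const std::size_t n = x.size() < y.size() ? x.size() : y.size();
          const float* xp = x.data();
          float* yp = y.data();
          for (std::size_t i = 0; i < n; ++i) {
              yp[i] = a * xp[i] + yp[i];
          }
      }

      int main() {
          std::vector<float> x(1024, 1.0f), y(1024, 2.0f);
          saxpy(3.0f, x, y);
          assert(y[0] == 5.0f);  // 3 * 1 + 2
          std::cout << y[0] << '\n';
      }
      ```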