Hey there!

I’m a chemical physicist who has been using Python (as well as MATLAB and R) for a lot of different tasks over the last ~10 years, mostly for data analysis but also to automate certain tasks. I am almost completely self-taught, and though I have gotten help and tips from professors while completing my degrees, I have never really been taught best practices when it comes to coding.

I have some friends who work as developers but have an academic background similar to mine, and through them I have become painfully aware of how bad my code is. When I write code, it simply needs to do the thing, conventions be damned. I do try to read up on the “right” way to do things, but the holes in my knowledge become apparent pretty quickly.

For example, I have never written a class, and I wouldn’t know why I’d want one or where to start (something to do with the init method, right?). I mostly just write functions and scripts that perform the tasks I need, plus some work in Jupyter notebooks from time to time. I only recently got started with Git and uploading my projects to GitHub, just as a way to teach myself the workflow.

So, I would like to get better. Can anyone recommend good resources for learning programming, ideally ones aimed at people who already know a language? It’d be nice to find a guide that assumes you already know more than a beginner. Any help would be appreciated.

  • If you don’t already, use version control (Git or otherwise) and try to write useful commit messages for yourself. 99% of the time you won’t need them, but you’ll be thankful the 1% of the time you do. I’ve seen database engineers hack something together without version control and, honestly, they’d have looked far more professional if we could have seen the recent changes when something went wrong. It’s also great to be able to revert to a known good state.

    Also, consider writing unit tests to check that your code does what you think it does. This tends to pay off most for code you’ll use over and over, but you might also find it helpful in complicated sections where your understanding isn’t great. Does the function output what it should or not? Start from some trivial cases and go from there.

    Lastly, what’s the nature of the code? As a developer, I have to live with my decisions for years (unless I switch jobs). I need my code to be maintainable and reusable, and I need to demonstrate that consideration to colleagues. That makes classes and modules extremely useful. If you’re frequently writing throwaway code for one-off analyses, those concepts might not be useful for you at all. In that case I’d focus more on correctness (tests) and efficiency. You might find your analyses run far faster if you have a good knowledge of data structures and algorithms and apply them well. I’ve personally reworked code written by coworkers to be 10x more efficient with clever usage of data structures. That might be a better use of your time than learning abstractions we use for large, long-term applications.
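
To make the testing and data-structure points above a bit more concrete, here is a minimal sketch. Everything in it is made up for illustration: the function names (`shared_ids_slow`, `shared_ids_fast`) and the task (finding which measured IDs also appear in a reference list) are assumptions, not anything from a real project. It shows one common kind of data-structure win, swapping repeated list scans for set lookups, plus a few tests that start from trivial cases and then check the fast version against the slow one.

```python
def shared_ids_slow(measured, reference):
    """Return the items of `measured` that also appear in `reference`.

    `x in reference` scans the whole list each time, so the total work
    grows with len(measured) * len(reference).
    """
    return [x for x in measured if x in reference]


def shared_ids_fast(measured, reference):
    """Same result as shared_ids_slow, using a set for membership checks.

    Assumes the items are hashable (ints, strings, tuples, ...).
    """
    reference_set = set(reference)  # built once; lookups are O(1) on average
    return [x for x in measured if x in reference_set]


def test_trivial_cases():
    # Start with inputs where the right answer is obvious.
    assert shared_ids_fast([], [1, 2, 3]) == []
    assert shared_ids_fast([1, 2], [3, 4]) == []
    # Order and duplicates of `measured` are preserved.
    assert shared_ids_fast([3, 1, 3, 7], [1, 3]) == [3, 1, 3]


def test_matches_slow_version():
    # The simple version acts as a reference for the optimized one.
    measured = list(range(0, 200, 3))
    reference = list(range(0, 200, 5))
    assert shared_ids_fast(measured, reference) == shared_ids_slow(measured, reference)
```

The test functions are plain Python, so you can call them by hand, or let a test runner such as pytest (a third-party package) discover them by their `test_` prefix. Whether the set version actually matters depends on how large your inputs are, so it's worth timing both on real data before rewriting anything.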