Hey there!

I’m a chemical physicist who has been using python (as well as matlab and R) for a lot of different tasks over the last ~10 years, mostly for data analysis but also to automate certain tasks. I am almost completely self-taught, and though I have gotten help and tips from professors throughout the completion of my degrees, I have never really been educated in best practices when it comes to coding.

I have some friends who work as developers but have a similar academic background as I do, and through them I have become painfully aware of how bad my code is. When I write code, it simply needs to do the thing, conventions be damned. I do try to read up on the “right” way to do things, but the holes in my knowledge become pretty apparent pretty quickly.

For example, I have never written a class and I wouldn’t know why or where to start (something to do with the init method, right?). I mostly just write functions and scripts that perform the tasks that I need, plus some work with jupyter notebooks from time to time. I only recently got started with git and uploading my projects to github, just as a way to try to teach myself the workflow.

So, I would like to learn to be better. Can anyone recommend good resources for learning programming, but perhaps that are aimed at people who already know a language? It’d be nice to find a guide that assumes you already know more than a beginner. Any help would be appreciated.

  • This is only tangentially related to improving your code directly as you have asked. However, in a similar vein as using source control (git), when using Python learn to manage your environments. Venv, poetry, conda/mamba, etc are tools to look into.

    I used to work with mostly scientists, and a good number of them knew some Python, but none of them knew how to properly manage their environments and it was a huge problem. They would often come to me and say “I ran this script a week ago and it worked, I tried it today without making any changes and it’s throwing this error now that I don’t understand.” Every time it was because they accidentally changed their dependencies, using their global python install. It also made it a nightmare to try to revive old code for them, since there was almost no way to know what version of various libraries were used.

    • This is huge. Unfortunately, as you indicated, there’s no standard tool for this and new ones are being added to the mix. Many in the science feilds are pushed towards Conda but I’m not sure it’s the best option. However, Conda will be infinitely better than not using anything to manage environments and dependencies.