No more code headers?

ThermosOfPain ( @ThermosOfPain@midwest.social ) · 11 months ago

No more code headers?

Sl00k ( @Sl00k@programming.dev ) · 11 months ago

I recently hired into a data analytics team

I work in Data Engineering and have spent most of my time on analytics teams. They don’t have a SWE/CS background and, because of that, generally don’t follow good programming practices. In my experience it’s hard to get them to follow a style guide properly, even if you set up SQLFluff for them. I can barely make them see the advantage of not committing directly to main (at least we’re using git). It’s very frustrating.

ThermosOfPain ( @ThermosOfPain@midwest.social ) · 11 months ago

Yep that’s us–maybe half of us have CS degrees.

The funny thing is that the pushback is coming from the “regular” development folks. At least we’re using git too :)

glad_cat ( @glad_cat@lemmy.sdf.org ) · edit-2 11 months ago

Yes, serious people write docs. I hate this bullshit about code that should be so good that it’s “auto-documenting.” It never happens in real life. Code is at best of average quality, so it needs documentation. At my previous job they had “guidelines” to make sure that code didn’t need docs. It was a bad joke and we had the worst code I’ve ever seen.

I don’t have solutions for you though. You need a combo of documentation generation, code formatter (in the CI maybe, or before a commit), and code linters to check for errors.

jvisick ( @jvisick@programming.dev ) · 11 months ago

“Self-documenting” just means “(I thought) I understood it when I wrote it, so you should too”. In other words, it really means “I don’t want to document my code”

JackbyDev ( @JackbyDev@programming.dev ) · 11 months ago

I like it better when the docs are embedded in the code or live alongside it. Everywhere I’ve worked it is a pain trying to find some random Confluence page or whatever where some API doc is.

Double_A ( @Double_A@discuss.tchncs.de ) · 11 months ago

Also if it’s not in the code, it will get outdated quickly and nobody will ever look at it. Separate docs are only really useful for main concepts that are not going to change that quickly.

pelotron ( @pelotron@midwest.social ) · edit-2 11 months ago

Hmm, do I want to open some external site/program to see my documentation or have it already in the code in front of me?

We use doxygen at my company and I think I’ve only ever opened it twice in 9 years.

glad_cat ( @glad_cat@lemmy.sdf.org ) · 11 months ago

Doxygen may be required in regulated industries like healthcare, banking, or robotics, but programmers never use it internally. The headers themselves are useful though and show that programmers take care of what they write even if they don’t read the generated HTML.

drdnl ( @drdnl@programming.dev ) · 11 months ago

A header might be useful, although there are likely better ways to (not) document what each SQL statement does.

But inline documentation? I’d suggest trying to work around that. Here’s an explanation as to why: https://youtu.be/Bf7vDBBOBUA

If possible, and as much as possible, things should simply make enough sense to be self-documenting, with only the high-level concepts actually documented. Everything else is at risk of becoming outdated or, worse, confusing.

TehPers ( @TehPers@beehaw.org ) · edit-2 11 months ago

Self-documenting code only documents what the code does, not why it does it. I can look at a well written method that populates a list with random elements from another list and go “I know what that does!” but reading the code doesn’t tell me the reason this code was written or why alternatives weren’t chosen.
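A minimal sketch of that distinction. The function name and the reason given in the comment are made up for illustration: the code shows what happens, and only the comment can carry the why.

```python
import random


def pick_random_items(items, count, seed=None):
    """Return `count` distinct items chosen at random from `items`."""
    # WHY (hypothetical reason): callers must never see the same item twice,
    # so we sample without replacement. random.choices() was not used
    # because it samples with replacement.
    rng = random.Random(seed)
    return rng.sample(items, count)
```

Delete the comment and the method is still perfectly readable, but the next person has no way to know whether switching to `random.choices()` would break anything.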

In the case of Rust, it goes even a step further when working with unsafe code. Sure I know what invariants need to be held for unsafe code to be sound, but not everyone does, and it isn’t always clear why a particular assumption made in an unsafe block (the list has at least 5 elements, for example) can be made soundly.

RustySharp ( @RustySharp@programming.dev ) · 11 months ago

…what the code does, not why it does it

This is my issue with “it’s self documenting code!”. I’m a maintenance coder. I deal with people’s code long after they’re dead (or ragequit). Some are for control systems.

if (waterPressure_psi > 500) raise PipeMayBurstException. Okay, we’re dealing with water pressure, in psi unit, and if it’s too high, it may break the piping. Self documenting!!

Except that our pipes are rated for 1000psi. SO WHY THE 500?! Do we have one or two sites - out of hundreds - with lower rated pipes? I can double performance if we raise the threshold to 700, well within the safety tolerance, but AM I GONNA KILL SOMEONE when they upgrade to our latest controller??

glue_snorter ( @glue_snorter@lemmy.sdfeu.org ) · edit-2 11 months ago

Ugh, a Magic String (I call it that whatever the type)

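# Rated maximum pipe pressure per facility, in psi.
FACILITY_MAX_PRESSURES = {
    "Durham": 1000,
    "Ipswich": 500,
    "Calne": 750,
}

# Assuming one threshold has to be safe at every site, use the lowest rating.
max_pressure = min(FACILITY_MAX_PRESSURES.values())

if water_pressure > max_pressure:
    raise PipeMayBurstException  # the exception from the comment above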

Obviously it should really pull from facility management, but that’s a bunch of moving parts when what you’d really prefer in the code is a constant.

Tbh it starts to look better to just define a constant and comment it.
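For what it’s worth, a sketch of the constant-plus-comment version. The names `MAX_SAFE_PRESSURE_PSI` and `check_pressure` and the reason written in the comment are invented for illustration; the real reason for the 500 is exactly what the original code never recorded.

```python
class PipeMayBurstException(Exception):
    """Raised when the measured pressure exceeds the safe limit."""


# Hypothetical reason, for illustration only: most pipes are rated for
# 1000 psi, but the Ipswich site is rated for 500 psi and runs the same
# controller build. Do not raise this without checking every site.
MAX_SAFE_PRESSURE_PSI = 500


def check_pressure(water_pressure_psi):
    """Raise if the reading is above the documented limit."""
    if water_pressure_psi > MAX_SAFE_PRESSURE_PSI:
        raise PipeMayBurstException(
            f"{water_pressure_psi} psi exceeds {MAX_SAFE_PRESSURE_PSI} psi"
        )
```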

RustySharp ( @RustySharp@programming.dev ) · edit-2 11 months ago

Tbh it starts to look better to just define a constant and comment it.

Well… if (waterPressure > MAX_PRESSURE_BEFORE_YOU_FLOOD_THE_WHOLE_TOWN_OF_IPSWICH_AND_CALNE) is pretty self-documenting. No comments needed.

drdnl ( @drdnl@programming.dev ) · 11 months ago

Although a bit long, I do like this almost impossible to ignore example of self documenting code :)

Double_A ( @Double_A@discuss.tchncs.de ) · edit-2 11 months ago

That’s because they are using magic numbers. If e.g. the 500 was MaximumPipeRating * SafetyMargin it would already be better.
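Roughly like this, as a sketch. Both values are assumptions: the 1000 psi rating is taken from the comment above, and 0.5 is simply the margin that reproduces the 500. The helper `is_over_limit` is hypothetical too.

```python
# Assumed values for illustration; neither is confirmed by the real code.
MAXIMUM_PIPE_RATING_PSI = 1000
SAFETY_MARGIN = 0.5

# The threshold is derived, so a reader can see where the 500 comes from.
PRESSURE_LIMIT_PSI = MAXIMUM_PIPE_RATING_PSI * SAFETY_MARGIN


def is_over_limit(water_pressure_psi):
    """Return True when the reading exceeds the derived limit."""
    return water_pressure_psi > PRESSURE_LIMIT_PSI
```

The names still don’t say why the margin is 0.5 rather than 0.7, so a one-line comment on `SAFETY_MARGIN` would probably still be worth having.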

Double_A ( @Double_A@discuss.tchncs.de ) · edit-2 11 months ago

If that list code is in a function called “PickRandomQuizQuestions” you would also know why it does that.

TehPers ( @TehPers@beehaw.org ) · 11 months ago

I encourage you to find a name for this function that describes why there is a second inner function. One restriction - the name of the function must be run (that’s what the trait being implemented calls it, you can’t rename it).

Sure, you can call the inner function run_inner_to_fix_rustc_issue_probably_caused_by_multiple_fnmut_impls but is that really any better than using two forward slashes to explain the context?

Double_A ( @Double_A@discuss.tchncs.de ) · 11 months ago

What do you mean by Code headers?

I hope you don’t mean those “Created by:” and “Last edited:” things… If yes, please don’t!

const_void ( @const_void@programming.dev ) · 11 months ago

Your friend may have a point.

It depends where the SQL is.

Is the SQL in a data model in an analytics platform? Some platforms will happily carry comments around like last week’s pizza during the query-generation phases of visualization, so it may not be appropriate to put comments inside a data model, as those comments could become bugs if the analytics platform is lame, like most are.
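One hypothetical way a comment can turn into a bug, sketched with Python’s built-in sqlite3. The `flatten` helper is a stand-in for a naive query generator, not the behavior of any particular platform, and the `users` table is made up.

```python
import sqlite3

QUERY = """
SELECT name
FROM users -- active accounts only
WHERE active = 1
"""


def flatten(sql):
    """Collapse a query onto one line, as a naive generator might."""
    return " ".join(line.strip() for line in sql.strip().splitlines())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("ann", 1), ("bob", 0)])

# As written, the comment ends at the line break and the filter applies.
original_rows = conn.execute(QUERY).fetchall()

# Once flattened, "--" swallows the WHERE clause along with the comment.
flattened_rows = conn.execute(flatten(QUERY)).fetchall()
```

Here `original_rows` contains only the active user, while `flattened_rows` contains everyone, and nothing raises an error along the way.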

With others, such as certain flavors of SQL DDL (tables, views, etc.), comments outside the DDL don’t make it into the resulting object, so headers may not be the right place either. Most RDBMSs have meta-descriptors that can be applied to DDL objects, so those might be good to look at.
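As a sketch of what such a meta-descriptor can look like: PostgreSQL, for example, has `COMMENT ON`, which stores the description in the catalog with the object itself. The helper below only builds the statement text; `comment_on` and the table name in the usage line are hypothetical, and other systems use different syntax.

```python
import re

# Plain or dotted identifiers only, to keep the sketch simple.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def comment_on(kind, name, text):
    """Build a PostgreSQL-style COMMENT ON statement as a string."""
    if kind not in {"TABLE", "VIEW", "COLUMN"}:
        raise ValueError(f"unsupported object kind: {kind}")
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unexpected identifier: {name}")
    escaped = text.replace("'", "''")  # standard SQL quote doubling
    return f"COMMENT ON {kind} {name} IS '{escaped}';"


statement = comment_on("TABLE", "sales.orders", "One row per customer's order")
```

The resulting statement can live in the same migration as the `CREATE TABLE`, so the description travels with the object instead of sitting in a file header that the database never sees.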

For arbitrary SQL, beyond a brief inline comment describing why it exists and what invokes it, your next best bet may be a link to a more descriptive data architecture diagram that shows how this unit of SQL integrates with others. You might prefer hyperlinked descriptions from that data architecture over searching through code.

As long as comments don’t require continual parsing (a one-time tax is inconsequential), definitely add details that you have figured out so others don’t have to re-learn the tribal mysteries of long-deceased ancients.