I have a JSON object with a huge array of nested objects. Let us assume it consists of records of license plates for vehicles. It would contain necessary fields like licenseID, issuingState, dateOfIssue, driverID etc.
What I am having trouble with is how to store data that is only used in exceptional cases, such as a field indicating that the license plate belongs to a foreign embassy (isEmbassyOwned), is owned by a government entity (isGovernmentOwned), or is a learner license (isLearner). The same applies to fields with non-Boolean data types, which would be empty, 0, or similar when there is no information for that field. These exceptional cases would occur in fewer than 10% of all object instances.
I am unsure which format would be best for storing this kind of data while balancing minimal storage consumption against human readability. Should I declare the fields for all objects regardless, or only include them when they are not empty? Should I store them in a dedicated array instead, or introduce a code value to be handled by a switch statement in the interpreter? Or is there some other approach I am not aware of?
4wd ( @fourwd@programming.dev ) 1•10 months ago
What about using enums? In this case you will have to specify them for all records, but this ensures that the field will always be present.
enum license_owner { regular_citizen = 0, embassy, government, ... }
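In Python, for instance, the same idea could be sketched roughly as below. The member names mirror the enum above (the elided members are left out); `LicenseOwner`, the `owner` key, and the sample record are made up for illustration.

```python
import json
from enum import IntEnum


class LicenseOwner(IntEnum):
    # Values follow the enum above; further members were elided there.
    REGULAR_CITIZEN = 0
    EMBASSY = 1
    GOVERNMENT = 2


def encode(record):
    """Serialize a record, storing the owner as its integer code."""
    return json.dumps({**record, "owner": record["owner"].value})


def decode(text):
    """Parse a record and turn the integer code back into an enum member."""
    record = json.loads(text)
    record["owner"] = LicenseOwner(record["owner"])
    return record


# Hypothetical sample record.
sample = {"licenseID": "ABC-123", "issuingState": "NY", "owner": LicenseOwner.EMBASSY}
text = encode(sample)
restored = decode(text)
```

An unknown code raises `ValueError` in `LicenseOwner(...)`, so bad data is caught when the record is read rather than later.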
I've heard about enums before, but I never really paid attention to them since I never needed them in any of my projects until now. I think this is exactly what I need. I'll research them further.
Thank you so much for your help.
TehPers ( @TehPers@beehaw.org ) English1•10 months ago
If they are mutually exclusive special cases, using an enum like another comment mentioned makes sense, and can limit the special cases to one field. You can use an enum of strings if you want it to be more readable.
As for how the data is represented, only including the special case field when there is one makes sense as well. Keep in mind JSON is also a flexible format - you can even have the array contain mixed types, like strings for simple licenses, and objects for more complex licenses. That can reduce the size of the JSON document quite a bit, if that’s an option.
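A minimal sketch of the "only include the field when it is set" approach, assuming every rare field has a well-defined default. The `DEFAULTS` table and the helper names `compact` and `expand` are hypothetical, as is the sample record.

```python
import json

# Assumed defaults for the rarely used fields (hypothetical table).
DEFAULTS = {"isEmbassyOwned": False, "isGovernmentOwned": False, "isLearner": False}


def compact(record):
    """Drop every optional field whose value equals its default."""
    return {k: v for k, v in record.items() if k not in DEFAULTS or v != DEFAULTS[k]}


def expand(record):
    """Restore omitted optional fields from DEFAULTS when reading."""
    return {**DEFAULTS, **record}


full = {
    "licenseID": "ABC-123",
    "issuingState": "NY",
    "isEmbassyOwned": False,
    "isGovernmentOwned": True,
    "isLearner": False,
}
small = compact(full)  # keeps only licenseID, issuingState, isGovernmentOwned
saved_bytes = len(json.dumps(full)) - len(json.dumps(small))
```

The stored document stays readable, and the reader side only needs `expand` (or `record.get(field, default)`) to see every field again.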
Nomecks ( @Nomecks@lemmy.ca ) 1•10 months ago
Convert the JSON to S3 keys and store it as a file structure.
kamstrup ( @kamstrup@programming.dev ) 1•10 months ago
Depending on your needs you can also break it into a columnar format with some standard compression on top. This allows you to search individual fields without looking at the rest.
It also compresses exceptionally well, and “rare” fields will be null in most records, so run-length encoding will compress them to near zero.
See, for example, Parquet.
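To illustrate why rare fields cost so little in a columnar layout, here is a toy sketch in plain Python. It is not Parquet's actual on-disk format, just the general idea; `to_columns`, `run_length_encode`, and the sample records are invented for the example.

```python
from itertools import groupby


def to_columns(records, fields):
    """Pivot row records into one list per field; missing fields become None."""
    return {f: [r.get(f) for r in records] for f in fields}


def run_length_encode(column):
    """Collapse consecutive equal values into [value, count] pairs."""
    return [[value, len(list(group))] for value, group in groupby(column)]


# Hypothetical records: only one of four carries the rare field.
records = [
    {"licenseID": "A1"},
    {"licenseID": "A2"},
    {"licenseID": "A3", "isLearner": True},
    {"licenseID": "A4"},
]
columns = to_columns(records, ["licenseID", "isLearner"])
encoded = run_length_encode(columns["isLearner"])
# encoded == [[None, 2], [True, 1], [None, 1]]
```

With thousands of records and a field that is set only occasionally, the encoded column shrinks to a handful of pairs, which is the effect described above.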