cross-posted from: https://lemm.ee/post/4890334
cross-posted from: https://lemm.ee/post/4890282
let’s say I have this code
` #include #include char name[50]; int main(){ fgets(name,50,stdin); name[strcspn(name, “\n”)] = ‘\0’; printf(“hi %s”, name); }
` and I decide my name is “ewroiugheqripougheqpiurghperiugheqrpiughqerpuigheqrpiugherpiugheqrpiughqerpioghqe4r”, my program will throw some unexpected behavior. How would I mitigate this?
- CasualTee ( @CasualTee@beehaw.org ) 24•1 year ago
First of, especially in C, you should very carefully read the documentation of the functions you use. It then should be obvious to you you are currently misusing it on two accounts:
- You are not checking for errors
- You are assuming the presence of a \n that might never be there (this one leads to your unexpected behavior)
The manual tells you it will insert a \0 at the end of what it reads within the limits of the buffer. So this \0 is what you will need to look for when determining the size of the input.
If there is a \n, it will precede the \0. Just make sure the \0 is not at index 0 before trying to erase the \n. If there is no \n before the \0, you are in either of two cases (again, this is detailed in the documentation): the input is truncated (you did not read the full line, as in your unexpected behavior above) or you are reaching the end of the stream. Note that even if the stream ends with \n, you might need to issue an additional fgets to know you are at the end of the stream in which case a \0 will be placed as the first byte of your buffer.
If you really want to handle input that exceeds your initial buffer, then you need to dynamically allocate one and grow it as needed. A well behaved program will have an upper limit to the size of the input anyway (and this is why you don’t use gets). So you will need a combination of malloc/realloc and string concatenation. That means you need to learn all the pitfalls of dynamic memory allocation in C and how to use valgrind. For the string concatenation, even though strcat should be OK in your case, I’d recommend against it.
In order to use strcat properly you need to keep track of the usage of the dynamically allocated buffer by hand anyway because you want to know when you will attempt to store more bytes in the buffer than is currently allocated. And once you know the number of bytes stored in the buffer, copying over the bytes that fgets returns by hand is fairly trivial and has less pitfalls. This also circumvent one of the performance pitfall of strcat: it needs to find the \0 in the destination buffer for every call. So effectively, it can transform all by itself a trivial usecase such as yours, that one would expect to be linear in algorithmic complexity, to be of O(N^2) complexity.
On a final note: fgets does not allow you to handle binary data properly because you wont be able to tell apart a legitimate \0 coming from your input from a \0 inserted by fgets. So you will need to use fread in this case. I actually recommend using fread instead of fgets because it directly returns the number of bytes read, no need to use strlen to guess it and it makes error handling easier. Though you’ll need to add the trailing \0 yourself.
- DirigibleProtein ( @DirigibleProtein@aussie.zone ) 6•1 year ago
Can’t help with your code, but you should be aware of:
- hairyballs ( @hairyballs@programming.dev ) 6•1 year ago
The first article is funny, because I moved from my native country to the one right next to it, and everybody is confused by my name. They have one given name and 2 family names, while I have 4 first names, and a compound last name.
No need to travel to the other side of the planet to meet a different culture of naming.
- Chris ( @floppy@rabbitea.rs ) English2•1 year ago
Not a name, but something else to be aware of which could also happen with names.
- DirigibleProtein ( @DirigibleProtein@aussie.zone ) 2•1 year ago
Wow. Almost worth changing my name again just to fuck with them.
- nothacking ( @nothacking@discuss.tchncs.de ) 4•1 year ago
Your going to have to read the input one piece at a time, allocating a bigger buffer as you go. (realloc is the way to go here) I recommend putting the input reading code in a function do you can easily use it multiple times.