Code:
echo "10625,0SWXAEYA,8465,2022-10-13 03:35:57,/dev/sdx,30 °C,30 °C,88 %,43608 Hours,88 days,"1,779.51 TB",HP VO1920JEUQQ,0SWXAEYA,1787.98GB,done,NIST.SP.800-88(1 Pass),SAS"
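This is your first problem.
The shell reads the " before 1,779.51 as closing the string and the " after TB as opening it again, so "1,779.51 TB" really means "close quotation, 1,779.51 TB, open quotation", and the quotes never reach sed. Not what you expected, is it? Wrapping the whole line in single quotes keeps them.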
Regarding the pattern: doing it right is quite tricky. It may be a good case for using sed's hold space, or nested substring capture combined with negated character classes.
E.g. [^,]* will match "all characters up to the first comma". You can try something along the lines of:
\("\([^,"]*,\)*[^,"]*"\)
\1 should return the whole quoted value, quotes included, while \2 should return the last comma-terminated chunk inside it (a repeated group only keeps its final match). Haven't tested, typing from memory.
Using those tricks with the global flag you should be able to end up with an expression that will drop commas within all quoted fields and ignore commas separating the fields.
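One way this could look, as a rough sketch only: instead of the global flag it uses a label and the t command to repeat a single substitution, which I find easier to get right. The function name is made up, and it assumes quotes are balanced on every line.

```shell
# Hypothetical helper name; reads CSV lines on stdin.
strip_quoted_commas() {
    # The prefix ([^"]*"[^"]*")* skips complete quoted fields, so the quote
    # that follows is always an opening one. The first comma after it is
    # dropped, and "ta" repeats until no comma is left inside quotes.
    sed -e ':a' \
        -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^",]*\),/\1/' \
        -e 'ta'
}

# Single quotes keep the inner double quotes intact.
line='88 days,"1,779.51 TB",HP VO1920JEUQQ,0SWXAEYA'
cleaned=$(printf '%s\n' "$line" | strip_quoted_commas)
echo "$cleaned"
```

Anchoring at the start of the line matters here. A shorter pattern that just looks for "quote, stuff, comma, stuff, quote" can also match the comma between two adjacent quoted fields and glue them together.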
Alternatively, hold space allows you to process a string bit by bit. You can take advantage of the newline at the end of the buffer to separate input from output during processing, and chew the quoted values one bite at a time.
Roughly: copy the buffer to hold space, remove everything after the second quote, remove commas after the first quote, append to hold, copy hold to pattern space, remove everything up to the second quote, and repeat until your buffer starts with a newline, at which point you drop the newline and print the result.
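A simplified sketch of the bite-by-bite idea follows. It does not follow the hold-space steps above literally: it only uses G to pick up a newline from the (empty) hold space and then keeps unprocessed input to the left of that newline and finished output to the right, all in the pattern space. It assumes GNU sed (for \n inside a bracket expression) and balanced quotes; the function name is made up.

```shell
# Hypothetical helper; assumes GNU sed and balanced double quotes per line.
strip_quoted_commas_chunked() {
    sed '
        # Append a newline (the hold space is empty): REST, newline, DONE.
        G
        :next
        # Move the unquoted text in front of the first quote to the end.
        s/^\([^"\n]*\)\(.*\)$/\2\1/
        :comma
        # If REST starts with a quoted field, drop its commas one by one.
        s/^\("[^",]*\),/\1/
        t comma
        # Move the cleaned quoted field to the end, then take the next bite.
        s/^\("[^"]*"\)\(.*\)$/\2\1/
        t next
        # The buffer now starts with the newline: drop it and let sed print.
        s/^\n//
    '
}

printf '%s\n' 'a,"b,c",d' '"1,2,3",z' | strip_quoted_commas_chunked
```

The loop ends exactly when the buffer starts with the newline, which is the stop condition described above.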
Yes, sed allows you to run scripts, including conditional execution in the simple form of "if a substitution succeeded, go to label" (the t command).
Check man sed for details.
Now, if you can guarantee there is only 1 field with a comma inside, this may or may not simplify the code. Either way, relying on this property of the input data will give you a buggy script which will probably break at some point down the line, so while it may be tempting, I discourage that.
BTW, if you only need to do that once and don't care about whatever problems might arise in the future, simply don't write any script at all and just import that CSV into Calc. It understands quoted fields and will help you do the conversion.