
sed: replace all tabs and spaces with a single space

I got a string like the following:

test.de.          1547    IN      SOA     ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600

Now I want to replace all the tabs/spaces in between the records with just a single space, so I can easily use it with cut -d " "

I tried the following:

sed "s/[\t[:space:]]+/[:space:]/g"

and various variations, but couldn't get it working. Any ideas?

I like using the following alias for bash. Building on what others wrote, it uses sed to replace runs of multiple spaces with a single space, which gives consistent results from cut. At the end, I run the output through sed one more time to change each space to a tab so that it's easier to read.

alias ll='ls -lh | sed "s/ \+/ /g" | cut -f5,9 -d" " | sed "s/ /\t/g"'
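To see what each stage of that pipeline does, here is a minimal sketch run on a fixed sample line instead of live ls output. The sample line, the variable names, and the use of sed -E and tr (in place of the escaped + and the second sed) are assumptions made for illustration, not part of the alias above.

```shell
# Hypothetical line in the shape of `ls -lh` output (not captured from a real run).
line='-rw-r--r--  1 alice  staff   1.2K Sep  7 12:00 notes.txt'

# Squeeze runs of spaces, keep fields 5 (size) and 9 (name), then tab-separate.
out=$(printf '%s\n' "$line" | sed -E 's/ +/ /g' | cut -d ' ' -f 5,9 | tr ' ' '\t')
printf '%s\n' "$out"
```

Because cut splits on every single space, a file name that itself contains spaces would be truncated at field 9.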

Use sed -e "s/[[:space:]]\+/ /g"

Here's an explanation:

[   # start of character class

  [:space:]  # The POSIX character class for whitespace characters. It's
             # functionally identical to [ \t\r\n\v\f], which matches a space,
             # tab, carriage return, newline, vertical tab, or form feed. See
             # https://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes

]   # end of character class

\+  # one or more of the previous item (anything matched in the brackets).

For your replacement, you only want to insert a space. [:space:] won't work there since that's an abbreviation for a character class and the regex engine wouldn't know what character to put there.

The + must be escaped in the regex because with sed's default (basic) regex engine + is a normal character, whereas \+ is a metacharacter for 'one or more'. On page 86 of Mastering Regular Expressions, Jeffrey Friedl mentions in a footnote that ed and grep used escaped parentheses because "Ken Thompson felt regular expressions would be used to work primarily with C code, where needing to match raw parentheses would be more common than backreferencing." I assume that he felt the same way about the plus sign, hence the need to escape it to use it as a metacharacter. It's easy to get tripped up by this.
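The two pitfalls described so far (a bare + in the pattern, and [:space:] in the replacement) can be reproduced with a small sketch. The input string and variable names are hypothetical, and the "one or more" form used here is the portable basic-regex spelling (a space followed by " *") rather than the escaped plus, which is a GNU extension.

```shell
# Hypothetical sample input: runs of spaces plus one literal plus sign.
input='a   b +c'

# Basic regex (sed's default): the bare + is an ordinary character, so
# this only matches a space followed by a literal plus sign.
bare=$(printf '%s\n' "$input" | sed 's/ +/ /g')
printf '%s\n' "$bare"        # a   b c

# Portable basic-regex spelling of "one or more spaces": a space, then " *".
portable=$(printf '%s\n' "$input" | sed 's/  */ /g')
printf '%s\n' "$portable"    # a b +c

# The replacement side is literal text, so [:space:] is inserted verbatim.
verbatim=$(printf '%s\n' "$input" | sed 's/  */[:space:]/g')
printf '%s\n' "$verbatim"    # a[:space:]b[:space:]+c
```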

In sed you'll need to escape +, ?, |, (, and ), or use -r to switch to extended regex (then it looks like sed -r -e "s/[[:space:]]+/ /g" or sed -re "s/[[:space:]]+/ /g").
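Applied to the record from the question, the extended-regex form might look like the sketch below. It assumes a sed that accepts -E as the extended-regex switch (GNU sed accepts it alongside -r, and BSD sed uses -E); the variable names and the tab placed in the sample record are illustrative.

```shell
# Record from the question, rebuilt with a tab and runs of spaces between fields.
record=$(printf 'test.de.\t  1547    IN      SOA     ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600')

# Extended regex: + means "one or more" without a backslash.
squeezed=$(printf '%s\n' "$record" | sed -E 's/[[:space:]]+/ /g')
printf '%s\n' "$squeezed"

# cut can now split on a single space; field 4 is the record type.
rtype=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 4)
printf '%s\n' "$rtype"
```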
