Commandline notes

Just some general notes to remember helpful things on the command line…

Variable remove pattern

In a bash loop you often want to replace part of a filename with something else. There are two simple parameter expansions for removing part of a filename or variable:

  • Remove from front: ${var#pattern}
  • Remove from back: ${var%pattern}

For example, let’s say you have a bunch of files following the pattern front-[id number]-back.txt (front-10-back.txt, front-192-back.txt, etc.).

To rename while removing the front: for f in *.txt; do mv "$f" "${f#front-}"; done

To rename while removing the back and changing the file extension: for f in *.txt; do mv "$f" "${f%-back.txt}.md"; done

However, you can’t apply both a front and a back removal in a single expansion. To get around that, use an intermediate variable. To combine the two commands above:

for f in *.txt; do NAME="${f#front-}"; mv "$f" "${NAME%-back.txt}.md"; done
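
Before running a rename loop for real, it can help to preview what each expansion produces. The following is a minimal sketch that assumes a hypothetical filename held in a plain variable, so no files are touched:

```shell
# Hypothetical sample name; nothing on disk is renamed here.
f="front-10-back.txt"

no_front="${f#front-}"           # strip the pattern from the front
no_back="${f%-back.txt}"         # strip the pattern from the back
id_only="${no_front%-back.txt}"  # second step, applied to the intermediate variable

echo "$no_front"       # 10-back.txt
echo "$no_back"        # front-10
echo "${id_only}.md"   # 10.md
```

Swapping mv for echo inside the real loop gives the same kind of preview for a whole directory.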

If you have spaces in the pattern, escape them. For example, to remove the prefix “name with space”, use something like ${f#name\ with\ space}.

You can also use the syntax ${var/pattern/replacementstring} to replace things in the filename. If the “pattern” chunk starts with an extra /, then it replaces all matches (not just the first). This way you can replace every space in the filenames in a directory with an underscore: for f in *\ *; do mv "$f" "${f// /_}"; done
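
To see the difference between the single-slash and double-slash forms, here is a small sketch on a hypothetical filename (again only a variable, no files involved):

```shell
f="my holiday photo.jpg"   # hypothetical filename containing two spaces

first_only="${f/ /_}"    # single slash: only the first space is replaced
all_spaces="${f// /_}"   # double slash: every space is replaced

echo "$first_only"   # my_holiday photo.jpg
echo "$all_spaces"   # my_holiday_photo.jpg
```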

Change String Case

Bash (version 4 and later) has a built-in feature to convert the case of strings, written as a suffix inside the expansion, e.g. ${var^^}:

  • ^ convert 1st character to uppercase
  • ^^ convert whole string to uppercase
  • , convert 1st character to lowercase
  • ,, convert whole string to lowercase

This can be helpful for renaming all files to lowercase filenames and extensions. To do the renaming by copying into a new folder “renamed” (which must already exist):

for f in *.*; do cp "$f" "renamed/${f,,}"; done

This could also be done with the older tr utility, using a more complicated command:

for f in *.*; do cp "$f" "renamed/$( tr '[:upper:]' '[:lower:]' <<<"$f" )"; done
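
As a quick illustration of the four case operators, here is a sketch on two hypothetical strings. It assumes Bash 4 or newer; older shells (such as the Bash 3.2 that ships with macOS) reject this syntax.

```shell
# Hypothetical sample values.
lower="draft notes.txt"
upper="DRAFT NOTES.TXT"

cap_first="${lower^}"     # Draft notes.txt
all_upper="${lower^^}"    # DRAFT NOTES.TXT
uncap_first="${upper,}"   # dRAFT NOTES.TXT
all_lower="${upper,,}"    # draft notes.txt

printf '%s\n' "$cap_first" "$all_upper" "$uncap_first" "$all_lower"
```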

Combine CSVs

Combine all files into one (tail -n +1 prints each file in full, starting from its first line, so each CSV’s header row is kept):

tail -n +1 *.csv > all.csv

In the output, each file’s contents are preceded by a line of the form ==> filename <==. Open the result in OpenRefine, use ==> as the separator, then use <== to create a filename column.

Or use a loop:

for f in *.txt; do cat "$f"; echo "separator"; done > combo.txt
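
If the CSVs all share the same header row, a common variation is to keep that row once and skip it in every later file. This is a minimal sketch using awk, which is not part of the notes above, run on hypothetical sample files in a scratch directory:

```shell
# Hypothetical sample data in a scratch directory, so nothing real is touched.
workdir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n' > "$workdir/one.csv"
printf 'id,name\n3,c\n' > "$workdir/two.csv"

# FNR is the line number within the current file, NR the line number overall.
# Skip line 1 of every file except the very first one.
# The output name does not end in .csv, so a rerun cannot pick it up as input.
awk 'FNR == 1 && NR != 1 { next } { print }' "$workdir"/*.csv > "$workdir/all.out"

cat "$workdir/all.out"
```

This assumes every file really does start with a header line; a file without one would lose its first data row.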

Count files of a specific type

Sometimes you need a count of how many files of a specific type are in a folder including sub-folders. A quick method is to use find and wc to get a count. In the directory you want to count:

find . -name "*.pdf" -type f | wc -l
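
Note that -name is case-sensitive, so files ending in .PDF are not counted. The sketch below compares it with -iname, a case-insensitive variant available in GNU and BSD find (it is not in POSIX), using hypothetical files in a scratch directory:

```shell
# Hypothetical sample tree.
scratch=$(mktemp -d)
mkdir -p "$scratch/sub"
touch "$scratch/a.pdf" "$scratch/sub/b.PDF" "$scratch/notes.txt"

exact=$(find "$scratch" -type f -name "*.pdf" | wc -l)     # matches a.pdf only
any_case=$(find "$scratch" -type f -iname "*.pdf" | wc -l) # matches a.pdf and b.PDF

echo "exact: $exact, any case: $any_case"
```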

Extract text from PDF

Use the pdftotext command from the poppler-utils package (often already installed on Linux; on Windows it is available with the Xpdf tools):

pdftotext filename outputname

In batch (with no output name given, each PDF gets a .txt file with the same base name):

for f in *.pdf; do pdftotext "$f"; done

Add something to each line

Use sed:

sed 's/^/stuff in front/; s/$/stuff at end/' "input.txt" > output.txt

In batch:

for f in *.txt; do sed 's/^/front/; s/$/back/' "$f" > output/"$f"; done
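
When the text being added contains slashes, the default / delimiter gets awkward. Here is a minimal sketch, assuming hypothetical file contents and added strings, that uses | as the sed delimiter instead and creates the output folder first:

```shell
# Hypothetical sample input in a scratch directory.
scratch=$(mktemp -d)
printf 'a.jpg\nb.jpg\n' > "$scratch/input.txt"

# The redirection fails if the output directory is missing, so create it first.
mkdir -p "$scratch/output"

# Using | as the delimiter avoids escaping the slashes in the added text.
sed 's|^|/images/|; s|$|?size=large|' "$scratch/input.txt" > "$scratch/output/input.txt"

cat "$scratch/output/input.txt"
```

An & in the replacement text stands for the matched text in sed, so a literal ampersand has to be written as \&.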

See Sed: an introduction for extensive tips.

Get list of files meeting criteria

I often have a folder of text files and need a list of which ones contain some specific string. This is easy using grep with the -l flag, which lists only the filename (not the matching lines):

grep -l "string" * > list.txt

Or the opposite, files without the phrase, using -L for filenames with no match:

grep -L "string" * > list.txt
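
To check how the two flags behave before pointing them at real data, here is a small sketch with two hypothetical files in a scratch directory:

```shell
# Hypothetical sample files: one contains the search string, one does not.
scratch=$(mktemp -d)
printf 'alpha\nneedle\n' > "$scratch/has.txt"
printf 'beta\n' > "$scratch/lacks.txt"

# Run from inside the folder so the listed names are bare filenames.
matching=$(cd "$scratch" && grep -l "needle" -- *.txt)
missing=$(cd "$scratch" && grep -L "needle" -- *.txt)

echo "with match: $matching"
echo "without match: $missing"
```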

Copy a list of files

I often have a folder of binary files, like images, where I need to sort out a small group to upload or move somewhere else. Often the list of files that need to move comes from some other source, such as a spreadsheet. To do it as a batch, I export a list of the filenames that need to move, one filename per line (just as if you ran ls > list.txt), then use a bash loop like this:

for f in $(cat list.txt); do cp "$f" ../upload/; done

In this example, cat list.txt prints the file contents to provide the values the loop iterates over. This can be handy for a lot of pragmatic solutions! Note that $(cat list.txt) splits on any whitespace, so a filename containing spaces will be broken into several pieces.
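
For lists that may contain filenames with spaces, a while read loop processes the list one whole line at a time. This is a minimal sketch using hypothetical files and folders in a scratch directory:

```shell
# Hypothetical sample layout: a source folder, an upload folder, and a list.
scratch=$(mktemp -d)
mkdir -p "$scratch/src" "$scratch/upload"
touch "$scratch/src/plain.jpg" "$scratch/src/with space.jpg" "$scratch/src/skip.jpg"
printf 'plain.jpg\nwith space.jpg\n' > "$scratch/list.txt"

# The subshell keeps the cd from affecting the rest of the session.
(
  cd "$scratch/src" || exit 1
  # IFS= and -r keep each line exactly as written, spaces and backslashes included.
  while IFS= read -r f; do
    cp -- "$f" ../upload/
  done < ../list.txt
)

ls "$scratch/upload"
```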

Don’t store in history

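If you add a space at the beginning of the line, then the command won’t be recorded in your history. This works in bash when the HISTCONTROL variable includes ignorespace or ignoreboth, which many distributions set by default. This is important if you are manually typing passwords or other secrets in the command.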
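
Whether the leading space is honored depends on your shell configuration. The following sketch shows how to check the setting and enable it for the current bash session:

```shell
# Show the current setting; ignorespace or ignoreboth enables the leading-space behavior.
echo "HISTCONTROL=${HISTCONTROL:-<unset>}"

# Enable it for this session (add the same line to ~/.bashrc to make it permanent).
HISTCONTROL=ignoreboth
```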

Use command output in another command

Sometimes you can’t pipe output into a command; you need the output inside the command instead (i.e. “command substitution”). Use $(command) in its place. E.g. JUPYTER=$(which jupyter) julia (starts julia with the JUPYTER environment variable set to the location of Jupyter Notebook).
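
The same pattern can be tried without Julia or Jupyter installed. This is a minimal sketch in which GREETING is a hypothetical variable name used only for the demo:

```shell
# Capture a command's output in a variable.
line_count=$(printf 'a\nb\nc\n' | wc -l)
echo "lines: $line_count"

# Use a command's output inline, as a one-off environment variable
# that exists only for the single command that follows it.
result=$(GREETING=$(printf 'hello') sh -c 'echo "$GREETING"')
echo "$result"   # hello
```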

Handy Ref