Demystifying the TAR command

Published on 2/9/2020

The cheatsheet

Untar/unzip a tar or tar.gz file into the current directory:

tar -xvf /path/to/archive

Zip/tar a directory into a compressed archive:

tar -cavf /path/to/archive.tar.gz /path/to/folder

The explanation

If you’re ever on Linux, tar might be one of those commands that you copy/paste every single time. Maybe you’re following the guide to install Golang and you paste in the command it lists, or you run into a stray tar.gz file and have to look up the plethora of flags to figure out what you need.

Let’s demystify tar. You’re welcome to use this article as a guide or just as a cheatsheet. I’m writing it as both…for myself.

What is tar?

A tar, or tarball, is a single file that represents many files. You can call it an archive, but before compression came along it really was just a way to represent a directory and its files as a single file.

You might be more familiar with a zip file (an archive with compression built in) or a rar, which is another compressed archive format.

On top of allowing us to transport entire directories as “one file”, tar also allowed fun stuff like checksums, recording the size of each file in the archive, and later, compression.

What is tar.gz?

A tar.gz is a compressed tar archive. If you ever see just .tar on an archive, it means that there’s no compression. With tar.gz (or tar.xz or tar.zst), you always expect compression. The gz itself might seem familiar and that’s because the tar archive was compressed via gzip.

Gzip is a compression tool but it is not an archive tool. So you can gzip a single file (web servers will gzip individual files before sending them over the wire) but not a collection of files. Since a tar is a collection of files put together to create a single file, gzip can compress that.
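To make the tar/gzip split concrete, here's a rough sketch that builds a tar.gz in two separate steps: tar bundles, then gzip compresses. The scratch directory and the names (`demo`, `a.txt`, `b.txt`) are made up for illustration.

```shell
# Work in a throwaway directory so nothing real gets touched.
work="$(mktemp -d)"
cd "$work"

# A hypothetical folder with two files in it.
mkdir demo
echo "hello" > demo/a.txt
echo "world" > demo/b.txt

# Step 1: tar bundles the folder into one uncompressed file.
tar -cf demo.tar demo

# Step 2: gzip compresses that single file and replaces it with demo.tar.gz.
gzip demo.tar

ls
```

After this runs, the folder holds the original `demo` directory and a `demo.tar.gz`, which is the same kind of file you'd download from a project's release page.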

Unzipping/extracting tar and tar.gz files

The tar command breaks down to:

tar [options] file/path

It feels like there’s an endless number of options, and I think that goes for any unix tool. It’s straightforward but so customizable.

The flags you need to remember are:

  1. x which tells tar to extract
  2. v for verbose output (recommended)
  3. f lets you specify the archive file you want to extract (keep it last in the group, since the path comes right after it)

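Why the f? Because tar is unixy: the archive doesn’t have to be a file on disk. It can also be data piped over from another command, so you have to say so explicitly when you’re pointing tar at a file.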
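As a small illustration of the piping idea, the sketch below decompresses an archive with gzip and pipes the result straight into tar. Passing `-` as the file name is the conventional way to say "read from standard input". The directory and file names are hypothetical.

```shell
work="$(mktemp -d)"
cd "$work"

# Build a small hypothetical archive to play with.
mkdir notes
echo "remember the milk" > notes/todo.txt
tar -cf notes.tar notes
gzip notes.tar            # leaves notes.tar.gz behind
rm -r notes               # remove the original so the extraction is visible

# gzip -dc decompresses to stdout; "f -" tells tar to read the archive from stdin.
gzip -dc notes.tar.gz | tar -xvf -

cat notes/todo.txt
```

You rarely need to do this by hand for a local file, but the same pattern shows up when an archive is streamed from another command.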

For any .tar or .tar.gz file, the entire command is:

tar -xvf path/to/archive

This will extract the archive’s contents into the current directory.
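Because the files land wherever you currently are, it's often handy to `cd` into an empty folder first. A minimal sketch, with made-up paths (`site`, `unpacked`):

```shell
work="$(mktemp -d)"
cd "$work"

# Create a hypothetical archive: a folder "site" with one file in it.
mkdir site
echo "<h1>hi</h1>" > site/index.html
tar -cf site.tar site

# Move into an empty folder and extract there.
mkdir unpacked
cd unpacked
tar -xvf ../site.tar

# The folder structure stored in the archive is recreated under the current directory.
cat site/index.html
```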

Zipping/tarring files

Creating an archive uses the same tar command, but the arguments look a bit different:

tar [options] /path/to/destination/archive.tar.gz path/to/file path/maybe/to/folder

tar adds every path you specify to the resulting archive, so you don’t have to tar one file at a time or pre-group everything into a folder.

It also requires different flags:

  1. c tells tar to create an archive
  2. a tells tar to auto-compress: it picks the compression tool based on the archive’s file extension (explanation below)
  3. v verbose output
  4. f lets you pick the name of the archive to save into

For compression to work, the archive’s extension has to match the compression tool. So if you want your archive to be gzipped, make sure the archive name ends in .gz.
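One way to see the extension rule in action is to create the same archive twice, once as `.tar` and once as `.tar.gz`, and ask gzip whether each one is valid gzip data. This is a sketch that assumes GNU tar or bsdtar (both understand `a`); the names are hypothetical.

```shell
work="$(mktemp -d)"
cd "$work"

mkdir logs
# A file with lots of similar lines so compression has something to work with.
seq 1 5000 > logs/app.log

# Same flags, different extensions.
tar -caf plain.tar logs        # no compression extension, so no compression
tar -caf packed.tar.gz logs    # ends in .gz, so tar compresses with gzip

# gzip -t checks whether a file is valid gzip data.
gzip -t packed.tar.gz && echo "packed.tar.gz is gzipped"
gzip -t plain.tar 2>/dev/null || echo "plain.tar is not gzipped"
```

Comparing the two with `ls -l` should also show that the `.tar.gz` is the smaller file.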

The entire command is:

tar -cavf /path/to/archive.tar.gz /paths/to/files/or/folders
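Here's a hedged example of archiving a loose file and a folder in one go, then unpacking the result elsewhere to confirm both made it in. The names (`readme.txt`, `src`, `bundle.tar.gz`) are invented, and `a` assumes GNU tar or bsdtar.

```shell
work="$(mktemp -d)"
cd "$work"

# One standalone file and one folder: no need to group them first.
echo "read me" > readme.txt
mkdir src
echo "package main" > src/main.go

# c = create, a = auto-compress from the extension, v = verbose, f = archive name.
tar -cavf bundle.tar.gz readme.txt src

# Unpack into a separate folder to check the contents.
mkdir check
cd check
tar -xvf ../bundle.tar.gz
ls
```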

Things to remember

You might have already noticed, but there’s a lot less to remember about the tar command if you keep this in mind:

  1. use the vf flags on every standalone (non-piped) command
  2. x means extract
  3. c means create
  4. a means auto-compress (used only with c)