Compression is a technique for reducing the size of data. Compression algorithms achieve this by identifying patterns and redundancy in the data and encoding them more compactly.
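To see how much patterns matter, the following sketch compresses two files of the same size, one highly repetitive and one random. The file names and the 100,000-byte size are arbitrary choices for illustration, and the exact output sizes will vary by system.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# 100,000 bytes of a single repeated character: highly patterned.
head -c 100000 /dev/zero | tr '\0' 'a' > "$workdir/repetitive.txt"

# 100,000 bytes of random data: essentially no patterns to exploit.
head -c 100000 /dev/urandom > "$workdir/random.bin"

# -c writes the compressed data to stdout, so the input files stay in place.
gzip -c "$workdir/repetitive.txt" > "$workdir/repetitive.txt.gz"
gzip -c "$workdir/random.bin" > "$workdir/random.bin.gz"

# tr strips the padding that some wc implementations add.
repetitive_size=$(wc -c < "$workdir/repetitive.txt.gz" | tr -d ' ')
random_size=$(wc -c < "$workdir/random.bin.gz" | tr -d ' ')

echo "repetitive: 100000 -> $repetitive_size bytes"
echo "random:     100000 -> $random_size bytes"
```

The repetitive file should shrink to a tiny fraction of its original size, while the random file stays at roughly its original size because there is no redundancy to remove.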
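Compression is divided into two types: lossless, where the original data can be reconstructed exactly, and lossy, where some information is discarded to save more space. The tools covered here (gzip, bzip2, xz, and zip) are all lossless.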
Before exploring compression, it is important to understand archiving.
Archiving is the process of collecting multiple files or directories into a single file. This archive can then be compressed using compression tools.
tar is a common archiving tool on Linux. Example usage:
tar -cvf archive.tar file1 file2 directory
In the example above:
- <code>-c</code>: Create a new archive
- <code>-v</code>: Verbose mode (lists each file as it is processed)
- <code>-f</code>: Specifies the archive file name
You can add a compression option: <code>-z</code> for gzip, <code>-j</code> for bzip2, or <code>-J</code> for xz.
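As a quick check that an archive contains what you expect, the sketch below builds a small archive and lists its members. The sample files are hypothetical and mirror the names in the example above; the `-t` (list) and `-C` (change directory) options are not covered in the text but are standard tar options.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)

# Hypothetical sample files, named after the example above.
mkdir "$workdir/directory"
echo "one" > "$workdir/file1"
echo "two" > "$workdir/file2"
echo "three" > "$workdir/directory/file3"

# -C makes tar change into $workdir first, so members are stored with
# relative names such as "file1" rather than full paths.
# -v is omitted here to keep the output short.
tar -cf "$workdir/archive.tar" -C "$workdir" file1 file2 directory

# -t lists the archive members without extracting anything.
listing=$(tar -tf "$workdir/archive.tar")
echo "$listing"
```

Listing before extracting is a cheap way to confirm the paths stored in an archive, which matters because extraction recreates those paths relative to the current directory.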
Common compression tools on Linux:
| Compression Tool | Compression Algorithm |
|---|---|
| gzip | DEFLATE |
| bzip2 | Burrows-Wheeler transform with Huffman coding |
| xz | LZMA/LZMA2 |
| zip | DEFLATE |
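One way to get a feel for how these tools differ is to run them on the same input and compare the output sizes. The sketch below does this for gzip, bzip2, and xz, skipping any tool that is not installed. The sample log content is made up for illustration, and the resulting sizes depend heavily on the input, so treat the numbers as a rough comparison rather than a benchmark.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)

# Sample input: 2,000 lines of repetitive, log-like text (hypothetical content).
i=0
while [ "$i" -lt 2000 ]; do
  echo "INFO request $i handled in 12 ms"
  i=$((i + 1))
done > "$workdir/sample.log"

original_size=$(wc -c < "$workdir/sample.log" | tr -d ' ')
echo "original: $original_size bytes"

# Try each tool that is installed. -c writes to stdout and keeps the input file.
for tool in gzip bzip2 xz; do
  if command -v "$tool" >/dev/null 2>&1; then
    size=$("$tool" -c "$workdir/sample.log" | wc -c | tr -d ' ')
    echo "$tool: $size bytes"
  else
    echo "$tool: not installed, skipped"
  fi
done
```

All three tools accept `-c`, which makes them easy to compare in a loop. zip is left out because it is an archiver with a different command-line shape.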
gzip is a widely used compression utility using the DEFLATE algorithm.
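gzip filename # compress
→ Produces <code>filename.gz</code> and removes the original <code>filename</code>
gunzip filename.gz # decompress; or gzip -d filename.gz
gzip -l filename.gz # show compressed size, uncompressed size, and ratio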
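The commands above can be combined into a round trip that confirms gzip is lossless. This is a minimal sketch: the file names `notes.txt` and `notes.orig` are hypothetical, and `cmp` is used only to check the result.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)

# Build a small sample file (hypothetical content).
i=0
while [ "$i" -lt 1000 ]; do
  echo "hello compression"
  i=$((i + 1))
done > "$workdir/notes.txt"

# Keep a reference copy for the comparison at the end.
cp "$workdir/notes.txt" "$workdir/notes.orig"

gzip "$workdir/notes.txt"        # notes.txt is replaced by notes.txt.gz
ls "$workdir"

gzip -l "$workdir/notes.txt.gz"  # compressed size, uncompressed size, ratio

gzip -d "$workdir/notes.txt.gz"  # notes.txt.gz is replaced by notes.txt again

# cmp -s exits with status 0 only if the files are byte-for-byte identical.
if cmp -s "$workdir/notes.txt" "$workdir/notes.orig"; then
  roundtrip="identical"
else
  roundtrip="different"
fi
echo "round trip: $roundtrip"
```

If you want to keep the original file while compressing, redirecting `gzip -c` output to a new file is a portable alternative.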
bzip2 uses the Burrows-Wheeler transform and typically compresses better than gzip, at the cost of slower compression.
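bzip2 filename # compress
→ Produces <code>filename.bz2</code> and removes the original <code>filename</code>
bunzip2 filename.bz2 # decompress; or bzip2 -d filename.bz2
bzcat filename.bz2 | wc -c # print the uncompressed size in bytes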
xz uses LZMA2, a variant of LZMA (the Lempel-Ziv-Markov chain algorithm). It usually achieves higher compression ratios than gzip or bzip2, but compression is slower and uses more memory.
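xz filename # compress
→ Produces <code>filename.xz</code> and removes the original <code>filename</code>
unxz filename.xz # decompress; or xz -d filename.xz
xz -l filename.xz # show compressed size, uncompressed size, and ratio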
zip is commonly used to compress and archive multiple files.
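zip archive.zip file1 file2 folder1 # add files; without -r, folder1 is added as an empty directory entry
zip -r archive.zip folder1 # add folder1 and everything inside it
zip -u archive.zip file3 # update: add file3 if it is new or newer than the archived copy
zip -r -e archive.zip folder1 # encrypt; prompts for a password
→ Produces <code>archive.zip</code>
unzip archive.zip # extract into the current directory
unzip -l archive.zip # list contents without extracting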
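The zip commands above can be tried end to end with the following sketch. The sample files are hypothetical, and the `-q` (quiet) option and `unzip -d` (extract into a given directory) are additions not covered in the text. Because zip and unzip are not installed on every system, the sketch skips itself when they are missing.

```shell
# Work in a throwaway directory with hypothetical sample files.
workdir=$(mktemp -d)
mkdir "$workdir/folder1"
echo "alpha" > "$workdir/folder1/a.txt"
echo "beta" > "$workdir/file1"

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
  # zip stores paths relative to the current directory, so run it from
  # inside $workdir. The subshell keeps the cd from affecting this shell.
  (cd "$workdir" && zip -r -q archive.zip file1 folder1)

  # List the contents without extracting.
  unzip -l "$workdir/archive.zip"

  # Extract into a separate directory instead of the current one.
  unzip -q "$workdir/archive.zip" -d "$workdir/restored"
  zip_status="extracted"
else
  zip_status="skipped"
  echo "zip/unzip not installed, skipping"
fi
echo "zip demo: $zip_status"
```

Unlike gzip, bzip2, and xz, zip leaves the original files in place, so no reference copy is needed here.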
tar supports built-in compression in one command:
# Compress
tar -czvf archive.tar.gz directory/
# Decompress
tar -xzvf archive.tar.gz

# Compress
tar -cjvf archive.tar.bz2 directory/
# Decompress
tar -xjvf archive.tar.bz2

# Compress
tar -cJvf archive.tar.xz directory/
# Decompress
tar -xJvf archive.tar.xz
Explanation of flags:
- <code>-c</code>: Create archive
- <code>-x</code>: Extract archive
- <code>-z</code>: Use gzip
- <code>-j</code>: Use bzip2
- <code>-J</code>: Use xz
- <code>-v</code>: Verbose (detailed output)
- <code>-f</code>: Specify archive file name
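Putting these flags together, the sketch below compresses a directory with gzip, extracts it elsewhere, and verifies that the restored copy matches the original. The directory contents are hypothetical, and `-C` (change directory), `-t` (list), and the `diff -r` check are additions for illustration, not part of the commands described above.

```shell
# Work in a throwaway directory with a hypothetical directory tree.
workdir=$(mktemp -d)
mkdir -p "$workdir/directory/sub"
echo "config" > "$workdir/directory/settings.conf"
echo "data" > "$workdir/directory/sub/data.txt"

# Create a gzip-compressed archive. -C makes tar change into $workdir
# first, so members are stored as "directory/..." rather than full paths.
tar -czf "$workdir/archive.tar.gz" -C "$workdir" directory

# List the members of the compressed archive without extracting.
tar -tzf "$workdir/archive.tar.gz"

# Extract into a separate directory.
mkdir "$workdir/restored"
tar -xzf "$workdir/archive.tar.gz" -C "$workdir/restored"

# diff -r exits with status 0 only if both trees have the same contents.
if diff -r "$workdir/directory" "$workdir/restored/directory" >/dev/null; then
  verify="match"
else
  verify="mismatch"
fi
echo "verification: $verify"
```

Swapping `-z` for `-j` or `-J` (and the file extension to match) gives the bzip2 and xz variants of the same round trip.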