Canonical path of a file in Bash

In Java, it's pretty straightforward to take any abstract pathname (such as ~/Desktop/regina.jpg) and convert it to its canonical pathname, /Users/jessewilson/Desktop/regina.jpg. Sometimes I find myself needing this function when writing simple shell scripts in Bash, and Google Groups showed me how.

Save the following text to a file called canonicalize, add it to your $PATH, and chmod it so it's executable:
#!/bin/bash
cd -P -- "$(dirname -- "$1")" &&
printf '%s\n' "$(pwd -P)/$(basename -- "$1")"


Then you can capture a file's full location from a relative filename and keep using it after you change directories:
chixdiglinux:jessewilson$ canonicalize ./.bash_profile
/Users/jessewilson/.bash_profile
chixdiglinux:jessewilson$ export JESSES_PROFILE=`canonicalize ./.bash_profile`
chixdiglinux:jessewilson$ cd ~kevinmaltby
chixdiglinux:kevinmaltby$ diff "$JESSES_PROFILE" ./.bash_profile
....

8 comments:

Adam Blinkinsop said...

Congrats, you're the first link on a Google search for "bash canonical path"!

On that note, there's a better way to do this:

readlink -f [FILE]

readlink is supposed to be used to "dereference" symbolic links, but it works just fine for normal paths. The -f flag displays the path even if the file itself doesn't exist, as long as the directories leading up to it do.

Take a look at the man page for more information.
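
Since later comments report systems where readlink -f is missing or broken, here is a minimal sketch that tries readlink -f first and falls back to the cd / pwd -P approach from the post. The function name canonical_path is made up for illustration, and the fallback does not resolve a final component that is itself a symlink.

```shell
# canonical_path is a hypothetical helper name, not something from the post.
canonical_path() {
  # Prefer readlink -f when this system's readlink accepts the flag.
  if readlink -f -- "$1" 2>/dev/null; then
    return 0
  fi
  # Fallback: the cd / pwd -P approach from the post, run in a subshell
  # so the caller's working directory is left untouched. cd output is
  # discarded because cd prints the directory when CDPATH is in play.
  (
    cd -P -- "$(dirname -- "$1")" >/dev/null 2>&1 &&
      printf '%s\n' "$(pwd -P)/$(basename -- "$1")"
  )
}
```

Usage would look like canonical_path ./some/file; the function returns a non-zero status when the containing directory does not exist.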

chris said...

In bash there's an option you can pass to pwd to dereference symbolic links in the path.

help pwd:

pwd: pwd [-LP]
Print the current working directory. With the -P option, pwd prints
the physical directory, without any symbolic links; the -L option
makes pwd follow symbolic links.

You could use this if you need full canonicalization of a path:

cd ~/foo; pwd -P
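
One caveat with the one-liner above is that it leaves you in the new directory. A minimal sketch that wraps the same idea in a subshell, so the caller's working directory is not changed, could look like this (physical_dir is a made-up name for illustration):

```shell
# physical_dir is a hypothetical helper name: print a directory's
# physical path (symlinks resolved) without moving the caller.
physical_dir() {
  # The parentheses start a subshell, so the cd is undone on return.
  # cd output is discarded because cd prints the directory when it
  # finds it through CDPATH.
  ( cd -- "$1" >/dev/null 2>&1 && pwd -P )
}
```

For example, physical_dir ~/foo prints the physical location of ~/foo, and the function returns a non-zero status if the directory cannot be entered.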

Anonymous said...

readlink -f doesn't appear to work in new Leopard (OS X 10.5) installs. It does seem to work in installs upgraded from OS X 10.4, though.

pvgoran said...

readlink -f was broken by a recent update on my Gentoo system as well, but only for a fusesmb filesystem; it seems to work fine with ext3.

Foogod said...

readlink is fine for systems that have it, but a lot still don't.

Your script is good, but it doesn't handle the case where the path you reference is actually a symlink itself.

Here's a slightly more robust version (I called mine "canonize"):

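path="$1"

# Follow the chain for as long as the path itself is a symlink.
while [ -L "$path" ]; do
    dir=`dirname "$path"`
    # Read the target from the "link -> target" part of the ls -l output.
    path=`ls -l "$path" | sed -e 's/.* -> //'`
    # Relative targets are resolved from the link's own directory.
    cd "$dir" || exit 1
done

dir=`dirname "$path"`
file=`basename "$path"`
if [ ! -d "$dir" ]; then
    echo "canonize: $dir: No such directory" >&2
    exit 1
fi
# Resolve symlinks in the directory part, then re-attach the file name.
cdir=`cd "$dir" && pwd -P`
printf "%s/%s\n" "$cdir" "$file"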

Ferry said...

a better version still:

#
# Get the canonical path of a file or directory.
# When the file or directory itself is a link then this is not resolved.
#
# Note: the '&> /dev/null' is needed because the 'cd' command will also
# print the directory into which it changes when the CDPATH environment
# variable is set.
#
# 1: the file or directory
#
function path-canonical-simple() {
    local dst="${1}"
    cd -P -- "$(dirname -- "${dst}")" &> /dev/null && echo "$(pwd -P)/$(basename -- "${dst}")"
}


#
# Get the canonical path of a file or directory.
# When the file or directory itself is a link then this is also resolved.
#
# 1: the file or directory
#
function path-canonical() {
    local dst="$(path-canonical-simple "${1}")"

    while [[ -h "${dst}" ]]; do
        local linkDst="$(ls -l "${dst}" | sed -r -e 's/^.*[[:space:]]*->[[:space:]]*(.*)[[:space:]]*$/\1/g')"
        if [[ -z "$(echo "${linkDst}" | grep -E '^/')" ]]; then
            # relative link destination
            linkDst="$(dirname "${dst}")/${linkDst}"
        fi
        dst="$(path-canonical-simple "${linkDst}")"
    done
    echo "${dst}"
}
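
Both symlink-following versions above read the link target by parsing ls -l output, which can misfire on names that contain " -> ". As a rough alternative sketch, plain readlink with no options prints a link's target on both GNU and BSD systems, so the same loop can be written without ls. The name resolve_link and the limit of 40 hops are assumptions made for this example.

```shell
# resolve_link is a hypothetical name. It assumes a readlink command that
# prints a link's target when called with no options.
resolve_link() {
  local path="$1" target hops=0
  while [ -L "$path" ]; do
    # Give up on link loops instead of spinning forever.
    hops=$((hops + 1))
    if [ "$hops" -gt 40 ]; then
      echo "resolve_link: too many levels of links: $1" >&2
      return 1
    fi
    target=$(readlink -- "$path") || return 1
    case "$target" in
      /*) path="$target" ;;                        # absolute target
      *)  path="$(dirname -- "$path")/$target" ;;  # relative to the link's directory
    esac
  done
  # Canonicalize the directory part in a subshell so the caller stays put.
  (
    cd -P -- "$(dirname -- "$path")" >/dev/null 2>&1 &&
      printf '%s\n' "$(pwd -P)/$(basename -- "$path")"
  )
}
```

The hop counter is a design choice: a link that points at itself would otherwise keep the loop running indefinitely.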

estani said...

Almost what I needed. In my case I finally use:
readlink -m [FILE]

to canonicalize even if some components of the path don't exist yet (not even the original post can handle that)
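
To illustrate the difference, here is a small sketch that assumes GNU coreutils readlink (the BSD/macOS readlink has no -m option). The directory and file names are made up for the example.

```shell
# Assumes GNU coreutils readlink; -m is not available on BSD/macOS.
base=$(cd -- "$(mktemp -d)" && pwd -P)   # an existing, symlink-free directory

# Neither "new" nor "dir" exists under $base yet.
missing="$base/new/dir/../file.txt"

# -m canonicalizes anyway and prints "$base/new/file.txt".
readlink -m -- "$missing"

# -f requires every component except the last to exist, so it fails here.
readlink -f -- "$missing" ||
  echo "readlink -f could not resolve $missing" >&2

rmdir "$base"
```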
