Key Features
Unix Architecture page on Wikipedia
Files are stored on disk in a hierarchical file system, with a single top location throughout the system (root, or “/”), with both files and directories, subdirectories, sub-subdirectories, and so on below it.
With few exceptions, devices and some types of communications between processes are managed and visible as files or pseudo-files within the file system hierarchy. This is known as "everything is a file".
Doug McIlroy (inventor of Unix pipes)
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
File system
Key differences from Windows
there are mount points instead of A:, C:, etc.,
directories and files are case sensitive, and
the separation character is / instead of \
What would appear as a separate media hierarchy in Windows (e.g., A:\MyDir\MyCode.c) simply appears under a separate directory (known as a mount point) in Unix (e.g., /media/disk/MyDir/MyCode.c).
Root (/)
/boot - boot loader files
/etc - configuration files
/dev - device files
/bin - user programs required for booting
/sbin - system programs required for booting
/lib{,32,64} - libraries required for booting
/usr - programs, libraries, and such not required for booting
/root - superuser directory
/home - users directories (shared by all nodes)
/tmp - temporary files
/var - variable data (spool files, log files, etc.)
/opt - add on package directory
/media - mount point for removable media
/proc - process information pseudo-file system
/sys - system information pseudo-file system
User (/usr)
The /usr directory is split off from the / directory mostly because disk space used to be precious.
/usr/bin - user programs not required for booting
/usr/sbin - system programs not required for booting
/usr/lib{,32,64} - libraries not required for booting
/usr/games - game programs
/usr/share - architecture independent data
/usr/man - on-line manuals
/usr/src - source code
/usr/include - header files
User Local (/usr/local)
The /usr/local directory is a place to locally install programs without messing up /usr. It mirrors the layout of /usr.
/usr/local/bin - locally installed user programs
/usr/local/sbin - locally installed system programs
/usr/local/lib{,32,64} - locally installed libraries
/usr/local/games - locally installed game programs
/usr/local/share - locally installed architecture independent data
/usr/local/man - locally installed on-line manuals
/usr/local/src - locally installed source code
/usr/local/include - locally installed header files
Alliance supercomputers
/project - group data files (shared by all nodes and all group members)
/scratch - user temporary data files (local to each cluster)
Devices
Some of the special /dev files are
/dev/null - discards all data written and provides no data
/dev/zero - provides a constant stream of NUL (zero) bytes
/dev/random - provides a stream of random bytes
/dev/urandom - provides a constant stream of pseudo-random bytes
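The behaviour of these device files can be checked from the shell. The following is a minimal sketch, assuming a head that accepts -c (a byte count), as the GNU and BSD versions do; the variable names are only illustrative.

```shell
# /dev/null swallows anything written to it and is empty when read.
echo "this text is discarded" > /dev/null
null_bytes=$(wc -c < /dev/null | tr -d ' ')

# /dev/zero yields as many zero (NUL) bytes as are requested.
zero_bytes=$(head -c 16 /dev/zero | wc -c | tr -d ' ')

# /dev/urandom yields pseudo-random bytes; od renders them as hex text.
random_hex=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "null: $null_bytes bytes, zero: $zero_bytes bytes, random: $random_hex"
```

Redirecting to /dev/null is the usual way to silence a command whose output is not wanted.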
Commands
Programs are run by specifying the command followed by the arguments separated by spaces.
program [argument…]
By convention, arguments are switches followed by strings (e.g., regexps, paths, file names, etc.). Switches are usually a single dash followed by one letter per switch, or a double dash followed by a descriptive string (e.g., rm -fr mydir or rm --force --recursive mydir). Most commands also understand
"-" - as a file name, means read from standard input or write to standard output (the terminal by default)
"--" - the end of switches and the start of the strings (in case a string needs to start with - or --)
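As a small illustration of the two conventions above, the sketch below creates a file with the awkward (hypothetical) name -n in a temporary directory and reads it back with cat. It assumes mktemp is available.

```shell
workdir=$(mktemp -d)

# Create a file whose name starts with a dash (hypothetical name "-n").
printf 'from file\n' > "$workdir/-n"

# "--" ends switch parsing, so -n is treated as a file name, not a switch.
from_file=$(cd "$workdir" && cat -- -n)

# "-" as a file name makes cat read standard input at that position.
combined=$(printf 'from stdin\n' | cat - "$workdir/-n")

echo "$from_file"
echo "$combined"
rm -rf -- "$workdir"
```

Without the --, cat would try to interpret -n as a switch instead of opening the file.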
Help
Traditionally, man pages (a single help page) have been the de facto documentation source. However, some software suites have been switching to info pages (a collection of hyperlinked pages). Help for the shell built-in commands is available through the built-in help command.
man command - on-line reference manuals
apropos [-a] keyword … - search on-line reference manuals (same as man -k)
info item - info documents
Directories
The current directory is "." and the parent directory is "..".
pwd - current directory
cd directory - change directory
mkdir directory - make directory
rmdir directory - remove directory
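A short walk through these commands might look like the sketch below. The directory name demo is hypothetical, and everything happens inside a temporary directory created with mktemp (assumed to be available).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir demo                 # make a directory
cd demo                    # move into it
inner=$(basename "$(pwd)") # pwd prints the full path; keep the last part
cd ..                      # ".." is the parent, so this moves back up
rmdir demo                 # remove the (now empty) directory

if [ -d demo ]; then still_there=yes; else still_there=no; fi
echo "was in: $inner, demo still exists: $still_there"

cd / && rmdir "$workdir"
```

rmdir only removes empty directories; a directory with content needs rm -r instead.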
Files
Files beginning with "." are considered hidden and not normally shown.
ls [-a] [-l] destination - list files
cp [-a|-p] [-r] [-s] source … destination - copy files
ln [-s] target name - link to file
mv source … destination - move files
rm [-r] [-f] destination … - remove files
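The file commands above can be tried safely in a scratch directory. This is only a sketch: the file names are made up, and mktemp is assumed to be available.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'data\n' > notes.txt
printf 'secret\n' > .hidden

plain_listing=$(ls)          # the hidden file is not shown
full_listing=$(ls -a)        # -a also shows ., .. and .hidden

cp notes.txt copy.txt        # copy a file
ln -s notes.txt link.txt     # symbolic link pointing at notes.txt
mv copy.txt renamed.txt      # move (here: rename) a file
via_link=$(cat link.txt)     # reading the link reads notes.txt
rm -f renamed.txt link.txt   # remove files

remaining=$(ls)
echo "left over: $remaining"
cd / && rm -rf -- "$workdir"
```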
Permissions
Standard permissions are read (r), write (w), and execute (x) for user (u), group (g), and other (o). They are frequently abbreviated as three octal digits (0=000, 1=001, 2=010, 3=011, 4=100, 5=101, 6=110, 7=111) whose bits correspond to user read, write, and execute; group read, write, and execute; and other read, write, and execute.
For directories, read allows the contents to be listed, write allows files to be added or removed, and execute allows the directory to be traversed.
chmod [u|g|o|a]…[+|-|=][r|w|x|X]… [-R] destination … - change mode (user/group/other permissions)
chown [-R] user destination … - change owner
chgrp [-R] group destination … - change group
setfacl [-m|-x] [-R] [[u|g|o|m]...:user:[r|w|x|X]...] destination … - set file access control list (individual users)
getfacl destination … - get file access control list (individual users)
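To make the octal and symbolic notations concrete, the sketch below sets the mode of a hypothetical file both ways and reads it back from the first ten characters of the ls -l output. It assumes mktemp is available.

```shell
workdir=$(mktemp -d)
file="$workdir/report.txt"
: > "$file"

# 6 = 110 (rw-), 4 = 100 (r--), 0 = 000 (---)
chmod 640 "$file"
octal_mode=$(ls -l "$file" | cut -c1-10)       # -rw-r-----

# Symbolic form: add execute for the user and read for other.
chmod u+x,o+r "$file"
symbolic_mode=$(ls -l "$file" | cut -c1-10)    # -rwxr--r--

echo "$octal_mode then $symbolic_mode"
rm -rf -- "$workdir"
```

The leading character of the mode string is the file type (- for a regular file, d for a directory), followed by the three rwx triples.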
View Files
The space key will advance a page and the q key will quit in more and less. In addition, the arrow keys will move in the appropriate direction in less.
more file - view one page at a time
less file - view forward and backwards
cat [file …] - concatenate files in sequence
head [-n lines] [file …] - first part of files
tail [-n lines] [-f] [file …] - last part of files
paste [-d delimiter] [file …] - concatenate files in parallel
cut [-d delimiter] [-f range] [file …] - extract columns
sort [-g] [-f] [-u] [file …] - sort lines
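The non-interactive commands in this list combine well on small column-oriented files. The sketch below uses a made-up file, fruit.txt, with colon separated columns; the file and variable names are assumptions for illustration.

```shell
workdir=$(mktemp -d)
printf 'pear:3\napple:1\nfig:2\napple:1\n' > "$workdir/fruit.txt"

first_line=$(head -n 1 "$workdir/fruit.txt")     # first part of the file
last_line=$(tail -n 1 "$workdir/fruit.txt")      # last part of the file
unique_sorted=$(sort -u "$workdir/fruit.txt")    # sorted, duplicates dropped

# cut extracts columns; paste glues files back together line by line.
cut -d : -f 1 "$workdir/fruit.txt" > "$workdir/names.txt"
cut -d : -f 2 "$workdir/fruit.txt" > "$workdir/counts.txt"
rejoined=$(paste -d = "$workdir/names.txt" "$workdir/counts.txt")

printf '%s\n' "$unique_sorted"
printf '%s\n' "$rejoined"
rm -rf -- "$workdir"
```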
Comparison
Digests are numbers computed from the content of files such that it is extremely difficult to come up with two different files with the same number.
diff [-w] [-i] [-U number|-y] file1 file2 - compare files line by line
sdiff [-W] file1 file2 - compare files side by side (similar to diff -y)
md5sum [file …] - compute MD5 digest
sha256sum [file …] - compute SHA256 digest
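One way to see how diff and digests relate is to compare three small files, two of them identical. This sketch assumes the sha256sum command from GNU coreutils is installed; the file names are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/a.txt"
printf 'alpha\nbeta\n' > "$workdir/same.txt"
printf 'alpha\nBETA\n' > "$workdir/b.txt"

# diff exits with 0 when the files match and 1 when they differ.
diff_same=0
diff "$workdir/a.txt" "$workdir/same.txt" > /dev/null || diff_same=$?
diff_changed=0
diff "$workdir/a.txt" "$workdir/b.txt" > /dev/null || diff_changed=$?

# -i ignores case, so these two files compare as equal again.
diff_nocase=0
diff -i "$workdir/a.txt" "$workdir/b.txt" > /dev/null || diff_nocase=$?

# Keep only the digest (first field); the second field is the file name.
digest_a=$(sha256sum "$workdir/a.txt" | cut -d ' ' -f 1)
digest_same=$(sha256sum "$workdir/same.txt" | cut -d ' ' -f 1)
digest_b=$(sha256sum "$workdir/b.txt" | cut -d ' ' -f 1)

echo "same: $diff_same, changed: $diff_changed, ignoring case: $diff_nocase"
rm -rf -- "$workdir"
```

Identical content always gives identical digests, so comparing digests is a cheap way to check whether two copies of a large file match.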
Searching
egrep [-i] [-v] regexp [file …] - find lines matching regexp in files (same as grep -E)
fgrep [-i] [-v] strings [file …] - find lines matching strings in files (same as grep -F)
find directory … predicates - find files satisfying predicates in directories
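The following sketch exercises the three search commands on a made-up log file. It uses the grep -E and grep -F spellings rather than egrep and fgrep, since the list above notes they are equivalent; -type and -name are two commonly used find predicates.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/src"
printf 'error: disk full\nwarning: a.b\nok\n' > "$workdir/src/app.log"
printf 'notes\n' > "$workdir/src/readme.txt"

# Regexp search: count lines starting with "error" or "warning".
problems=$(grep -E '^(error|warning)' "$workdir/src/app.log" | wc -l | tr -d ' ')

# Fixed string search: the "." in a.b is literal, not "any character".
literal=$(grep -F 'a.b' "$workdir/src/app.log")

# -v inverts the match and keeps the lines that do not match.
others=$(grep -E -v '^(error|warning)' "$workdir/src/app.log")

# find with predicates: regular files whose name ends in .log
logs=$(find "$workdir/src" -type f -name '*.log')

echo "$problems problem lines, other lines: $others, log files: $logs"
rm -rf -- "$workdir"
```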
Process
Each process (a running program) is identified by a unique number, its process ID.
ps [-A|-U user] [-H] [-f] - process list
kill [-s signal] process … - signal process
nohup command - disconnect command
nice command - low priority command
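Signalling a process by its number can be sketched as follows. The example starts a throwaway sleep in the background, uses the shell's $! (the process ID of the most recent background command) and the wait built-in to collect the exit status; neither is listed above, so treat them as assumptions of this sketch.

```shell
# Start a long-running process in the background; $! holds its process ID.
sleep 60 &
pid=$!

# Signal 0 delivers nothing and only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then was_running=yes; else was_running=no; fi

# Ask the process to terminate, then collect its exit status.
kill -s TERM "$pid"
exit_status=0
wait "$pid" 2>/dev/null || exit_status=$?

echo "running before kill: $was_running, status after kill: $exit_status"
```

On Linux, TERM is signal 15, so the collected status is 143 (128 + 15).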
Remote
ssh [user@]host [command] - login to remote system
scp [[user@]host:]source … [[user@]host:]destination - copy remote files
unix2dos file … - convert to DOS line breaks
dos2unix file … - convert to Unix line breaks
Other
sleep seconds - waits given number of seconds
echo [-n] [-e] strings - prints strings
test tests - perform various string (e.g., equality) or file (e.g., existence) tests
Editors
The two most popular Unix editors are vi and emacs. Both are extremely powerful and very complex. A simpler editor is nano.
vi [file …] - common Unix editor
emacs [-nw] [file …] - common Unix editor
nano [file …] - simple Unix editor
Vi
Vi distinguishes between command and insert mode. Command mode allows you to move around and enter commands. Insert mode allows you to edit text.
:h - help
:w[!] [file] - write file (exclamation forces it)
:e file - edit file
:q[!] - quit Vi (exclamation forces it)
:n[!] - next file (exclamation forces it)
[a|A] - append after cursor or at end of line
[i|I] - insert (capital for beginning of line)
[v|V] - start selecting by character or by whole line
[c[w|c]|C] - change selection/word/line or to end of line
[d[w|d]|D] - delete selection/word/line or to end of line
[y[w|y]|Y] - copy selection/word/line or to end of line
[p|P] - paste after or before cursor/line
J - join lines
[u|U] - undo (capital for current line)
ESC - revert to command mode
Emacs
Emacs is a more traditional single mode editor. Partially typed entries can be completed by pressing TAB (twice to list).
CTRL+h - help (b lists keys and k describes a key)
CTRL+g - abort current operation
CTRL+x [1|2|3] - single window or split vertical/horizontal window
CTRL+x CTRL+s - save current buffer
CTRL+x b - switch current buffer
CTRL+x k - close current buffer
CTRL+x CTRL+c - quit Emacs
CTRL+SPACE - mark start of region
CTRL+w - cut from start of region to cursor (ALT+w copies instead)
CTRL+y - paste cut or copied region
CTRL+k - delete to end of line or line if start of line
CTRL+s - search for text
CTRL+_ - undo
ALT+x - enter command (TAB twice to list)
Command Line
The shell is a command line interpreter that lets users run programs. It provides ways to start programs and to set up and manipulate the context in which they run. The main parts of this context are
arguments,
environment,
standard input (stdin),
standard output (stdout),
standard error (stderr), and
return value
A standard command looks like so
command [< stdinfile] [>[>] stdoutfile] [2>[>] stderrfile] [&]
Arguments
Options passed to the program to tweak its behaviour. Traditionally these are switches (e.g., -xzf or --extract --gzip --file) followed by strings (e.g., regexps, paths, file names, etc.). Partially typed file names and directories can be completed by pressing TAB (twice to list).
Before the program is run, the shell applies the following expansions to the arguments.
…{…}… (brace expansion) - if not quoted, expands once for each item in a comma separated list or once for each number in a .. separated range
~… (tilde expansion) - if not quoted, expands to home directory of user following the tilde or the current user if no user specified
${...} (parameter and variable expansion) - if not single quoted, expands to environment variable specified or the corresponding parameter if number specified ({ and } are not always required)
$(...) (command substitution) - if not single quoted, expands to output of command (`…` is an alternative syntax)
$((...)) (arithmetic substitution) - if not single quoted, expands to evaluated result of the expression
… (word splitting) - if not quoted, splits into separate arguments anywhere an IFS character (by default space, tab, and newline) occurs
…[*|?|[…]]… (path name expansion) - if not quoted, is considered a pattern and replaced with matching file names (* matches any string, ? matches any single character, and […] matches any one of the enclosed characters)
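Most of these expansions can be observed directly. The sketch below covers the ones that behave the same in any POSIX shell and leaves out brace expansion, which plain sh implementations such as dash do not provide. It uses the set built-in to count the resulting arguments; the file and variable names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > a.txt
: > b.txt
: > c.log

label="two words"

# Parameter expansion: braces separate the name from the text that follows.
with_suffix="${label}!"

# Command substitution: replaced by the output of the command.
file_total=$(ls | wc -l | tr -d ' ')

# Arithmetic substitution: replaced by the value of the expression.
product=$((6 * 7))

# Word splitting: unquoted, the value becomes two arguments; quoted, one.
set -- $label
unquoted_args=$#
set -- "$label"
quoted_args=$#

# Path name expansion: the pattern is replaced by the sorted matching names.
set -- *.txt
txt_files="$*"
set -- [bc].*
bc_files="$*"

# Tilde expansion: a bare ~ becomes the current user's home directory.
home_dir=~

echo "$with_suffix | $file_total | $product | $txt_files | $bc_files | $home_dir"
cd / && rm -rf -- "$workdir"
```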
Quoting
Special characters can be escaped with \ to remove their special meaning. Single and double quoting strings affects escaping as well as which expansions and substitutions are performed.
'…' - no expansions or substitutions are performed
"…" - only escaping, parameter and variable expansion, command substitution, and arithmetic substitution occur
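The difference between the quoting styles is easiest to see by assigning the same text three ways. This is a minimal sketch with a hypothetical variable user_name.

```shell
user_name="ada"

# Single quotes: everything is kept literally.
single='$user_name has $((1 + 1)) files'

# Double quotes: the variable and the arithmetic are expanded.
double="$user_name has $((1 + 1)) files"

# Backslash inside double quotes: the escaped $ stays literal.
escaped="\$user_name costs \$5"

printf '%s\n' "$single" "$double" "$escaped"
```

Double quotes are the usual choice around variables, because they still expand the value but prevent word splitting and path name expansion on the result.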
Environment
A set of key value pairs (e.g., USER=root) that programs can look up and use. Each program gets a fresh copy (i.e., changing it will not change the original) of all environment variables marked for export.
key=value - make a local environment variable
export key[=value] - mark an environment variable for export
unset key - delete an environment variable
Two important environment variables are
PATH - list of : separated directories to look for programs in
LD_LIBRARY_PATH - list of : separated directories to look for libraries in (ahead of the system defaults specified in /etc/ld.so.conf)
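The export and copy semantics described in this section can be checked by starting a child shell. In this sketch the variable names local_only and SHARED_DEMO are made up, and ${name:-unset} is a parameter expansion that substitutes the word "unset" when the variable has no value.

```shell
local_only="stays in this shell"
export SHARED_DEMO="visible to children"

# A child process only receives variables that were marked for export.
child_view=$(sh -c 'echo "${local_only:-unset}|${SHARED_DEMO:-unset}"')

# The child works on its own copy, so its change does not reach the parent.
sh -c 'SHARED_DEMO="changed in child"'
after_child=$SHARED_DEMO

# unset deletes the variable from the current shell.
unset SHARED_DEMO
after_unset=${SHARED_DEMO:-unset}

echo "child saw: $child_view"
echo "parent after child: $after_child, after unset: $after_unset"
```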
Input and Output
Programs are run with a standard place to read input from, a standard place to write output to, and a standard place to write error messages to. By default these are all the terminal window in which the program is run. This can be changed via
< file - read standard input from file
[>|>>] file - write standard output to file (overwriting or appending)
[2>|2>>] file - write standard error to file (overwriting or appending)
[&>|&>>] file - write standard output and error to file (overwriting or appending)
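The redirections can be tried with a couple of scratch files. The sketch below sticks to the forms that work in any POSIX shell and leaves out &> and &>>, which are bash extensions; the file names are hypothetical.

```shell
workdir=$(mktemp -d)
out="$workdir/out.txt"
err="$workdir/err.txt"

echo "first" > "$out"       # > overwrites (or creates) the file
echo "second" >> "$out"     # >> appends to it

# ls complains on standard error about the missing file; 2> captures that.
ls "$workdir/does-not-exist" 2> "$err" || true

# < makes the file the standard input of wc.
line_count=$(wc -l < "$out" | tr -d ' ')

if [ -s "$err" ]; then error_captured=yes; else error_captured=no; fi
echo "lines in out.txt: $line_count, error message captured: $error_captured"
rm -rf -- "$workdir"
```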
Status
Programs return an integer exit status. The status of the most recently executed foreground command is available as $?.
0 - program completed successfully
1…127 - program specific error code
128…255 - program terminated by signal (status is 128 + signal number)
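The three status ranges can be reproduced with a few tiny commands. This sketch captures failures with the || operator so that it also works in shells configured to stop on errors; the variable names are illustrative.

```shell
# Success: true always exits with status 0.
true
success=$?

# Failure: false always exits with status 1.
failure=0
false || failure=$?

# A program can choose its own error code.
custom=0
sh -c 'exit 3' || custom=$?

# A child shell that sends itself SIGTERM is reported as 128 + signal number.
signalled=0
sh -c 'kill -s TERM $$' 2>/dev/null || signalled=$?

echo "success=$success failure=$failure custom=$custom signalled=$signalled"
```

On Linux, TERM is signal 15, so the last status is 143.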
Job Control
Programs run in the foreground by default. Background jobs will be suspended if they require input. Existing jobs will be sent SIGHUP when the shell exits.
jobs - list jobs
fg id - switch job to foreground
bg id … - switch jobs to background
disown id … - release jobs from job control
Foreground jobs usually respond to the following key combinations
CTRL+Z - suspend program
CTRL+C - abort program
CTRL+D - end of input
Multiple Commands
Commands can be combined in several ways.
… ; … - run first command and then second (same as pressing ENTER)
… & … - run first command in background at the same time as second
… | … - run first command in background with its output going to the second as input
… && … - run first command and then second only if first returns success
… || … - run first command and then second only if first returns failure
Commands can also be grouped in two ways.
{ … } - group commands in current shell (has to end with ; or newline)
( … ) - group commands in sub shell (does not have to end with ; or newline)
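A compact way to see these operators at work is to capture what each combination prints. The last part of the sketch shows the practical difference between the two grouping forms: an assignment inside braces persists, one inside parentheses is lost with the sub shell.

```shell
# ; runs both commands in order.
both=$(echo one; echo two)

# | feeds the output of the first command to the second.
upper=$(echo "shout" | tr 'a-z' 'A-Z')

# && runs the second command only after success.
and_result=$(true && echo "ran")

# || runs the second command only after failure.
or_result=$(false || echo "fallback")
skipped=$(true || echo "never printed")

# Braces run in the current shell, parentheses in a sub shell.
counter=0
{ counter=1; }
after_braces=$counter
( counter=2 )
after_parens=$counter

echo "$upper $and_result $or_result [$skipped] $after_braces $after_parens"
```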
Scripting
Executable text files whose first line is #!command (e.g., #!/bin/bash for shell scripts) are run as "command file", i.e., the named command is started with the script file as its argument.
Parameters
$# - number of parameters
$0 - name of shell or shell script
$number - positional parameter
$* - all positional parameters (in double quotes expands as one argument)
$@ - all positional parameters (in double quotes expands as separate arguments)
The following built-in commands manipulate parameters
shift [number] - drop specified number of parameters (one if unspecified)
set parameter … - set parameters to given parameters
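The difference between "$*" and "$@" only shows up when a parameter contains spaces. The sketch below sets three parameters with set and counts how many arguments a small helper receives; count_args is a hypothetical helper defined just for this example.

```shell
# Hypothetical helper: prints how many arguments it was given.
count_args() { echo $#; }

set -- "first arg" second third
total=$#

as_one=$(count_args "$*")      # one argument: "first arg second third"
as_many=$(count_args "$@")     # three arguments, spaces preserved
unquoted=$(count_args $*)      # four arguments after word splitting

shift                          # drop "first arg"
after_shift=$1
remaining=$#

echo "total=$total one=$as_one many=$as_many unquoted=$unquoted"
echo "after shift: \$1=$after_shift, $remaining left"
```

In scripts that pass their parameters on to another command, "$@" is almost always the form that is wanted.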
Programming
if command …; then command …; [elif command …; then command …;] … [else command …;] fi - conditionally run commands depending on success of if and elif commands
for key in value …; do command …; done - for each value, set key to value and run commands
while command …; do command …; done - repeatedly run commands until while commands fail
case value in [pattern[|pattern]…) command …;;] … esac - run commands where first pattern matches (same as path name expansion)
continue [number] - next iteration of enclosing loop (innermost if not specified)
break [number] - exit enclosing loop (innermost if not specified)
function name { command; } … - create a command that runs the commands with passed parameters
return [number] - return from function with given exit status (last command if not specified)
exit [number] - quit shell with given exit status (last command if not specified)
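The constructs above fit together as in the following sketch. It defines functions with the name() { …; } spelling, which works in bash as well as in plain sh, instead of the function keyword; classify and is_even are hypothetical helpers.

```shell
# case: pick a label based on the first matching pattern.
classify() {
    case "$1" in
        *.txt|*.md) echo "text" ;;
        *.sh)       echo "script" ;;
        *)          echo "other" ;;
    esac
}

# for: run the body once per value.
summary=""
for name in notes.txt run.sh photo.png; do
    summary="$summary$(classify "$name") "
done

# while: repeat until the test command fails.
countdown=3
steps=0
while [ "$countdown" -gt 0 ]; do
    countdown=$((countdown - 1))
    steps=$((steps + 1))
done

# return: the function's exit status drives the if below.
is_even() {
    if [ $(($1 % 2)) -eq 0 ]; then return 0; else return 1; fi
}

if is_even 4; then parity="even"; else parity="odd"; fi

echo "$summary| $steps steps | 4 is $parity"
```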
Regular Expressions
Regular expressions are strings where several of the non-alphanumeric characters have special meaning. They provide a concise and flexible means for string searching and replacing and are used by several Unix programs.
Anchoring
^ - match start of line
$ - match end of line
Characters
character - the indicated character
. - any character
[…] - any character in the list or range (^ inverts)
Combining
(…) - group
…|… - match either or
Repetition
? - match zero or one times
* - match zero or more times
+ - match one or more times
{…} - match a range of times
Replacement
\digit - substitute text matched by corresponding group
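The pieces of the regular expression syntax can be combined as in the sketch below, which uses grep -E to count matching lines and sed -E to do a replacement with group references. The sample text is made up, and sed -E is assumed to be supported, as it is in the GNU and BSD versions.

```shell
log='error 404 at /index
warning 12 at /about
ok'

# Anchoring, grouping, and alternation: lines starting with either word.
anchored=$(printf '%s\n' "$log" | grep -E -c '^(error|warning) ')

# Character range with a repetition count: exactly three digits.
three_digit=$(printf '%s\n' "$log" | grep -E -c ' [0-9]{3} ')

# ? makes the preceding character optional.
optional=$(printf 'color\ncolour\ncolr\n' | grep -E -c '^colou?r$')

# Replacement: \1 and \2 refer to the first and second group.
swapped=$(echo "key=value" | sed -E 's/^([a-z]+)=([a-z]+)$/\2=\1/')

echo "anchored=$anchored three_digit=$three_digit optional=$optional swapped=$swapped"
```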