Installation and usage

To run this shell, you need Python 3.6. No other packages are needed.

Example usage:

python3.6 -m pip install -r requirements.txt
cd ./cli/src/
python3.6 main.py

This package is continuously tested on Linux (using Travis CI) and Windows (using AppVeyor). Code coverage is measured on Travis. This helps keep the user experience stable across platforms.

Shell module

The main module that uses other modules.

This module has a class Shell which does all the shell work:

  • read an input
  • call an appropriate module to preprocess input
  • call an appropriate module to lex input
  • call an appropriate module to parse lexemes
  • invoke the program represented by (sort of) AST.

class cli.shell.Shell[source]

The main shell REPL class.

Has an environment.Environment inside.

Create an instance and run main_loop if you want to interact with the user, like this:

shell = Shell()
shell.main_loop()

apply_command_result(command_result)[source]

Take a program’s result into account, i.e. change the Shell’s state.

Change self.env according to command_result. This could be done in main_loop itself, but keeping it as a separate method makes testing easier.

main_loop()[source]

Infinite loop: prompt user input, show command output.

Reads from stdin, writes to stdout. Supports recovering from parsing and lexing errors.
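A minimal sketch of such a loop, assuming that lexing and parsing errors share a common base exception and that exit is signalled by its own exception. The names run_loop, read_line and write are hypothetical, and the two exception classes are stand-ins for the ones in the exceptions module.

```python
class ShellException(Exception):
    """Stand-in for the shell's base exception (assumed hierarchy)."""


class ExitException(ShellException):
    """Stand-in for the exception raised by the exit command."""


def run_loop(read_line, process_input, write):
    """Read, run and print until exit is requested or the input ends."""
    while True:
        try:
            line = read_line()
        except EOFError:
            break
        try:
            write(process_input(line))
        except ExitException:
            break
        except ShellException as error:
            # A lexing or parsing error: report it and keep the loop alive.
            write("error: {}".format(error))
```

Passing the reader and writer as arguments keeps the loop testable without touching stdin or stdout; an interactive session could pass `lambda: input("> ")` and `print`.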

process_input(inp)[source]

Take input string, parse it, run it.

Args:
inp (str): an input string
Returns:
commands.RunnableCommandResult.
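The stages listed for the Shell class can be pictured as one function that threads the input through each step in order. This is only a sketch: the stage functions are passed in as hypothetical parameters, and the environment is a plain dict rather than an environment.Environment.

```python
def process_input(inp, env, preprocess, lex, parse):
    """Run one input line through the preprocess, lex, parse and run stages."""
    expanded = preprocess(inp, env)   # expand $x-like variables
    lexemes = lex(expanded)           # split the string into lexemes
    command = parse(lexemes)          # build a runnable command
    return command(env)               # run it in the given environment
```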

Environment module

Abstraction of shell environment

class cli.environment.Environment[source]

A shell environment, in which commands run.

It is a collection of pairs <var_name, var_value> plus a current directory.
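A minimal sketch of such a class, using the method names documented below. The internal attributes and the behaviour for an unset variable (returning an empty string) are assumptions made for illustration.

```python
import os


class Environment:
    """Variables plus a current directory, both stored as plain strings."""

    def __init__(self):
        self._vars = {}
        self._cwd = os.getcwd()

    def get_var(self, name):
        # Assumption: an unset variable reads as an empty string.
        return self._vars.get(name, "")

    def set_var(self, name, value):
        self._vars[name] = value

    def get_cwd(self):
        return self._cwd

    def set_cwd(self, dir_name):
        # The path is stored as-is, without checking that it exists.
        self._cwd = dir_name
```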

get_cwd()[source]

Get string representation of current working directory

get_var(name)[source]

Get variable value by name

set_cwd(dir_name)[source]

Set current working directory.

Do not check that the directory exists: we don’t really care, it is just a string.

Args:
dir_name (str): string representation of a new working directory path.

set_var(name, value)[source]

Set variable value.

Args:
name (str): variable name.
value (str): variable value (it should be a string).

Commands module

Abstractions of commands and their results.

In principle, we have two types of commands:

  • a single command, like wc or cat. It accepts
    input, returns something, etc. This is represented by SingleCommand.
  • a combination of commands, like pwd | wc. It consists
    of several commands that interact with each other by some rules. This is represented by CommandChain.

Since each command has the same interface (i.e. it can run, given input and environment), the above classes share a common base class, which represents an abstract command - RunnableCommand.

This module also contains an abstraction of a command run result.
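The shared interface can be sketched with an abstract base class. The run signature here is simplified (plain strings and a tuple instead of streams and RunnableCommandResult), and Upper is a hypothetical command used only to show a concrete subclass.

```python
from abc import ABC, abstractmethod


class RunnableCommand(ABC):
    """Anything that can be run, given an input and an environment."""

    @abstractmethod
    def run(self, input_text, env):
        """Return a tuple (output_text, new_env, return_code)."""


class SingleCommand(RunnableCommand):
    """A command that executes on its own; still abstract."""

    def __init__(self, args_lst):
        self._args_lst = args_lst


class CommandChain(RunnableCommand):
    """A command that combines two other commands; still abstract."""

    def __init__(self, cmd1, cmd2):
        self._cmd1 = cmd1
        self._cmd2 = cmd2


class Upper(SingleCommand):
    """Hypothetical concrete command: upper-cases its input."""

    def run(self, input_text, env):
        return input_text.upper(), env, 0
```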

class cli.commands.CommandAssignment(args_lst)[source]

An environment assignment.

Modifies the environment, reassigning one variable’s value.

Command args:
0 – string of the form var=value.

Assignment of quoted strings is not supported, e.g. x="a b c".

class cli.commands.CommandChain(cmd1, cmd2)[source]

A subset of commands: those which take two commands and combine them.

This is an abstract class.

class cli.commands.CommandChainPipe(cmd1, cmd2)[source]

Pipe: take one command’s output and put into other command as input.

The second command may ignore its input entirely. One can chain environment variable assignments using Pipe. All assignments will take place.

If the first command fails (i.e. completes with non-zero status), then the result of Pipe is equal to the result of the first command (i.e. the second command is not run).
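A sketch of these pipe rules, modelling a command as a plain function from (input_text, env) to (output_text, new_env, return_code). All helper names are hypothetical, and passing the first command's resulting environment to the second one is an assumption based on the statement that all assignments take place.

```python
def assign(name, value):
    """Build a command that sets one variable and prints nothing."""
    def run(input_text, env):
        new_env = dict(env)
        new_env[name] = value
        return "", new_env, 0
    return run


def pipe(cmd1, cmd2):
    """Build a command that feeds cmd1's output into cmd2."""
    def run(input_text, env):
        output, env1, code = cmd1(input_text, env)
        if code != 0:
            # The first command failed: its result is the pipe's result.
            return output, env1, code
        return cmd2(output, env1)
    return run


def fail(input_text, env):
    """A command that always fails."""
    return "", env, 1


def count_chars(input_text, env):
    """A command that prints the length of its input."""
    return str(len(input_text)), env, 0
```

With these helpers, `pipe(assign("x", "1"), assign("y", "2"))` leaves both variables set, while `pipe(fail, count_chars)` never runs the second command.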

Examples:
cat test.txt | wc
echo 123 | pwd
x=1 | y=2

class cli.commands.RunnableCommand[source]

An abstraction of every possible shell command (in broad sense).

A command can be run. This is an abstract class.

run(input_stream, env)[source]

Main action with the command: run it, given input and environment.

Args:
input_stream (streams.InputStream): an input for this command;
env (environment.Environment): an environment in which the command runs.
Returns:
RunnableCommandResult.

class cli.commands.RunnableCommandResult(output_stream, new_env, ret_code)[source]

Represents result of invoking a RunnableCommand.

It is more convenient to have a single class than a bunch of values (output, return code, etc.).

get_input_stream()[source]

Getter for the input stream

get_output()[source]

Read the command’s output.

Returns:
str: a string representation of output.

get_result_environment()[source]

Getter for the new environment

get_return_code()[source]

Getter for the return code

class cli.commands.SingleCommand(args_lst)[source]

A subset of commands: those which can execute on their own.

This is an abstract class.

Single commands module

Concrete SingleCommand subclasses and their Factory.

Since there can be many SingleCommand subclasses (cat, echo, wc, ..., you name it), it is convenient to have them in a separate module.

There is a Factory class that “knows” how to choose an appropriate Command given its string representation.

class cli.single_command.CommandCat(args_lst)[source]

cat command: print its input or file contents.

Command args:

0 – cat

1 (optional) – a filename. If provided, cat will output that file’s contents. Otherwise, it will print its input. Can be relative or absolute.

Returns FILE_NOT_FOUND exit code if file was provided, but was not found.

class cli.single_command.CommandCd(args_lst)[source]

cd command: change directory.

Command args:
0 – cd
1 – a new filepath. Can be relative or absolute.

Returns NEW_DIR_INVALID if the directory does not exist. Returns BAD_NUMBER_OF_ARGS if the wrong number of arguments is supplied.

class cli.single_command.CommandEcho(args_lst)[source]

echo command: prints its arguments.

Command args:

0 – echo

1..n – strings. Those strings are written to the output, space-separated. A new line is then written (as seen in bash).

class cli.single_command.CommandExit(args_lst)[source]

exit command: exit shell.

Exits by raising an exception, so no further commands are run.

class cli.single_command.CommandExternal(args_lst)[source]

An external command (not described in shell).

Command args:
0 – a command’s name. The command is looked up by joining the current working directory with the command name (i.e. it must be located in the current directory).
1..n – the command’s arguments. There can be any arguments, depending on the command.

class cli.single_command.CommandGrep(args_lst)[source]

grep command: find strings in a file.

Usage:
grep [-A <n>] [-i] [-w] PATTERN FILE
Arguments:
PATTERN regular expression pattern to search for
FILE path to file where the search is performed
Options:
-i ignore case when searching
-w search for the whole word
-A <n> print n lines after match [default: 0]
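The three options can be sketched on top of Python's re module. This is an illustration, not the shell's implementation: grep_lines is a hypothetical helper that works on a list of lines instead of a file, and wrapping the pattern in word boundaries is one assumed reading of -w.

```python
import re


def grep_lines(pattern, lines, ignore_case=False, whole_word=False, after=0):
    """Return matching lines plus `after` lines of trailing context."""
    if whole_word:
        # Assumption: -w means the match must sit on word boundaries.
        pattern = r"\b(?:{})\b".format(pattern)
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    result = []
    remaining = 0  # context lines still to be printed after the last match
    for line in lines:
        if regex.search(line):
            result.append(line)
            remaining = after
        elif remaining > 0:
            result.append(line)
            remaining -= 1
    return result
```

For example, `grep_lines("foo", lines, ignore_case=True, whole_word=True)` keeps a line containing the word "Foo" but drops one that only contains "food".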
class cli.single_command.CommandPwd(args_lst)[source]

pwd command: print the current working directory.

class cli.single_command.CommandWc(args_lst)[source]

wc command: count the number of words, characters and lines.

Command args:

0 – wc

1 (optional) – a filename. If provided, then wc will count the words, characters and lines in this file. Otherwise, it will count them in its input. Can be relative or absolute.

Returns FILE_NOT_FOUND exit code if file was provided, but was not found.
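A sketch of the counting and of the missing-file case. The function names, the order and format of the printed counts, and the numeric value of FILE_NOT_FOUND are all assumptions; the documentation above does not specify them.

```python
FILE_NOT_FOUND = 1  # hypothetical value; the real constant is not shown here


def count_text(text):
    """Return (lines, words, characters) for a string."""
    return len(text.splitlines()), len(text.split()), len(text)


def run_wc(input_text, filename=None):
    """Count the file if a name is given, otherwise count the input."""
    if filename is not None:
        try:
            with open(filename) as handle:
                input_text = handle.read()
        except FileNotFoundError:
            return "", FILE_NOT_FOUND
    return "{} {} {}".format(*count_text(input_text)), 0
```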

class cli.single_command.SingleCommandFactory[source]

A class that is responsible for building Single Commands.

This class knows which commands exist in shell.

static build_command(cmd_name_and_args_lexem_lst)[source]

Build a single command out of a list of lexemes representing its arguments.

Args:
cmd_name_and_args_lexem_lst (list[lexer.Lexem]): a list of lexemes. The first one must be a STRING that represents a command name. The rest are STRING or QUOTED_STRING lexemes.

All string representations of lexemes are passed to a corresponding SingleCommand descendant.
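One common way to build such a factory is a lookup table from command names to classes. The sketch below takes the lexemes' string values directly, uses empty stand-in classes, and falls back to CommandExternal for unknown names; that fallback is an assumption.

```python
class _Command:
    """Stand-in base class that just stores its arguments."""

    def __init__(self, args_lst):
        self.args_lst = args_lst


class CommandEcho(_Command):
    pass


class CommandPwd(_Command):
    pass


class CommandExternal(_Command):
    pass


# Names the shell knows about; anything else is treated as external.
_KNOWN_COMMANDS = {"echo": CommandEcho, "pwd": CommandPwd}


def build_command(args_lst):
    """Pick a class by the command name and pass it all the strings."""
    command_class = _KNOWN_COMMANDS.get(args_lst[0], CommandExternal)
    return command_class(args_lst)
```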

Preprocessor module

A module with Preprocessor responsibility.

This module holds the Preprocessor - an entity that accepts raw string as input, and expands variables (in the form $x) according to an environment.

Preprocessing is a common action in programming languages, so we use it in interpreting Shell commands as well.

class cli.preprocessor.Preprocessor[source]

A static class for preprocessing a shell input string.

Given a raw string, we want to preprocess it, i.e. replace things like $x with the previously assigned value of x.

static substitute_environment_variables(raw_str, env)[source]

Do a one-time pass over string and substitute $x-like patterns.

Args:

raw_str (str): an initial, unprocessed string;

env (environment.Environment): an environment in which this string must be expanded.
Returns:

str. The processed string. All substrings in single quotes are left untouched.

Inside double quotes, nonempty substrings starting with a $ sign and ending before a

  • space symbol
  • double or single quotes
  • $ sign

are treated as variable names. The values for these variables are queried from the input env.

Outside any quotation, similar rules apply: nonempty substrings that start with $ and end either

  • before the next space character
  • before the next $ sign
  • at the end of the input string
  • at the beginning of quotes (single or double)

are treated as variable names.

Example:

If the environment contains:

x=1
long_name=qwe

Then the following substitutions apply (nonexistent variables are substituted by an empty string):

echo "123$x"    -->         echo "1231"
echo "123$x "    -->        echo "1231 "
echo "123$xy "    -->       echo "123 "
echo "123$x dfg"    -->     echo "1231 dfg"
echo $long_name'123'  -->   echo qwe'123'
echo $long_name2'123'  -->  echo '123'
echo $x '123'  -->          echo 1 '123'
echo $x"qwe"    -->         echo 1"qwe"
echo $x$long_name  -->      echo 1qwe
echo `$x`"$x"  -->          echo `$x`"1"
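The rules above can be sketched as a single left-to-right scan. This is an illustrative reimplementation, not the module's code: substitute_variables is a hypothetical name, the environment is a plain dict, and a lone $ with no name after it is left unchanged by assumption.

```python
def substitute_variables(raw_str, env):
    """Expand $name patterns everywhere except inside single quotes."""
    terminators = " '\"$"  # characters that end a variable name
    out = []
    quote = None  # the quote character currently open, if any
    i = 0
    while i < len(raw_str):
        ch = raw_str[i]
        if quote is None and ch in "'\"":
            quote = ch  # a quoted section starts
            out.append(ch)
            i += 1
        elif ch == quote:
            quote = None  # the quoted section ends
            out.append(ch)
            i += 1
        elif ch == "$" and quote != "'":
            j = i + 1
            while j < len(raw_str) and raw_str[j] not in terminators:
                j += 1
            name = raw_str[i + 1:j]
            # Unknown variables expand to an empty string.
            out.append(env.get(name, "") if name else "$")
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With `env = {"x": "1", "long_name": "qwe"}`, this sketch maps `echo $x$long_name` to `echo 1qwe` and leaves `'$x'` untouched.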

Lexer module

A module with Lexer responsibility.

Lexing is a common step in language compilation or interpreting. A lexeme is a group of characters of the input stream, grouped by some “meaning”.

class cli.lexer.Lexem(tp, val, start_idx, end_idx)[source]

A single lexem.

A Lexem provides an interface for querying its position in the string (for producing meaningful error messages), its type, and its string representation.

get_position()[source]

Return string representation of the position.

For example, if self._start_idx = 1 and self.end_idx = 5, then this function will return (1:5).

get_type()[source]

Return type of this lexem.

get_value()[source]

Return string representation of this lexem.

Quotes are stripped from QUOTED_STRING.

class cli.lexer.LexemType[source]

Possible types of Lexems that Lexer produces.

These are possible lexemes:

  • QUOTED_STRING is a sequence of characters inside double
    or single quotes;
  • STRING is a non-space sequence of characters;
  • ASSIGNMENT is a string of the form “smth=smth_other” (without quotes),
    smth_other can be empty;
  • PIPE is a | symbol.

class cli.lexer.Lexer[source]

A static class for lexing a preprocessed string.

Given a string, we want to split it into meaningful (more or less) tokens. Possible tokens are described in LexemType docstring.

static get_lexemes(raw_str)[source]

Scan the string left-to-right, output list of lexemes.

Args:
raw_str (str): a string to lex.
Returns:
list[Lexem] – the resulting lexemes.
Raises:
exceptions.LexException: if some quoted string started but never ends.
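A regex-based sketch of such a scan, using the four lexeme types listed for LexemType. The Lexeme tuple, LexError and the exact character classes are assumptions for illustration; in particular, how the real lexer treats a quote directly attached to a word is not specified above.

```python
import re
from typing import NamedTuple


class Lexeme(NamedTuple):
    kind: str
    value: str
    start: int
    end: int


class LexError(Exception):
    """Stand-in for exceptions.LexException."""


_TOKEN = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<PIPE>\|)
  | (?P<QUOTED_STRING>"[^"]*"|'[^']*')
  | (?P<ASSIGNMENT>[^\s|'"=]+=[^\s|'"]*)
  | (?P<STRING>[^\s|'"]+)
""", re.VERBOSE)


def get_lexemes(raw_str):
    """Scan left to right and return the list of lexemes."""
    lexemes = []
    pos = 0
    while pos < len(raw_str):
        match = _TOKEN.match(raw_str, pos)
        if match is None:
            # Only an opening quote without a closing one can get here.
            raise LexError("unterminated quote at position {}".format(pos))
        kind = match.lastgroup
        if kind != "SPACE":
            text = match.group()
            if kind == "QUOTED_STRING":
                text = text[1:-1]  # strip the surrounding quotes
            lexemes.append(Lexeme(kind, text, match.start(), match.end()))
        pos = match.end()
    return lexemes
```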

Parser module

A module with Parser responsibility.

Parsing is one of the later steps in program compilation or interpreting.

It ensures that the stream of lexemes forms a valid program. The result is a tree that represents a program.

In our case, the result will be commands.RunnableCommand.

class cli.parser.Parser[source]

A static class for parsing a list of lexemes.

Parser ensures that a stream of lexemes matches the syntactic structure of a valid command. It also builds a representation of this command along the way.

static build_command(lexemes)[source]

Build commands.RunnableCommand out of list of lexemes.

Our grammar is as follows:

<start> ::= <command> (PIPE <command>)*
<command> ::= <assignment> | <single_command>
<assignment> ::= ASSIGNMENT
<single_command> ::= STRING (STRING | QUOTED_STRING | ASSIGNMENT)*

where ASSIGNMENT, QUOTED_STRING, STRING and PIPE are lexemes.

Every rule is implemented as a static method named _parse_smth, where smth is the rule name. It returns a pair:

  • a resulting commands.RunnableCommand
  • a list of unparsed lexemes
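The grammar and the "result plus unparsed rest" convention can be sketched as a small recursive-descent parser. Lexemes are modelled as (type, value) tuples and commands as nested tuples; these shapes, the function names and ParseError are assumptions for illustration.

```python
class ParseError(Exception):
    """Stand-in for exceptions.ParseException."""


def parse_single_command(lexemes):
    """<single_command> ::= STRING (STRING | QUOTED_STRING | ASSIGNMENT)*"""
    if not lexemes or lexemes[0][0] != "STRING":
        raise ParseError("expected a command name")
    args = [lexemes[0][1]]
    rest = lexemes[1:]
    while rest and rest[0][0] in ("STRING", "QUOTED_STRING", "ASSIGNMENT"):
        args.append(rest[0][1])
        rest = rest[1:]
    return ("single", args), rest


def parse_command(lexemes):
    """<command> ::= <assignment> | <single_command>"""
    if lexemes and lexemes[0][0] == "ASSIGNMENT":
        return ("assignment", lexemes[0][1]), lexemes[1:]
    return parse_single_command(lexemes)


def parse_start(lexemes):
    """<start> ::= <command> (PIPE <command>)*"""
    command, rest = parse_command(lexemes)
    while rest and rest[0][0] == "PIPE":
        right, rest = parse_command(rest[1:])
        command = ("pipe", command, right)
    return command, rest


def build_command(lexemes):
    """Parse the whole list; leftover lexemes are an error."""
    command, rest = parse_start(lexemes)
    if rest:
        raise ParseError("unexpected lexeme: {!r}".format(rest[0]))
    return command
```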

Streams module

Abstractions of command input and output.

Every command accepts some input and results in some output. This module contains abstractions of these ideas.

class cli.streams.InputStream[source]

An abstraction of command’s input.

A command can read from InputStream.

get_input()[source]

Read the whole input (as a string)

class cli.streams.OutputStream[source]

An abstraction of command’s output.

A command can write into OutputStream.

to_input_stream()[source]

Convert this OutputStream to an InputStream.

As a result of pipe, e.g. “echo 123 | wc”, a command’s output becomes another command’s input.

write(string)[source]

Write a string to output stream

write_line(string)[source]

Write a newline-trailed string to output stream
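A minimal in-memory sketch of the two streams, using the method names documented above. The constructors and the internal buffer are assumptions.

```python
class InputStream:
    """Holds the whole input of a command as one string."""

    def __init__(self, text=""):
        self._text = text

    def get_input(self):
        return self._text


class OutputStream:
    """Collects everything a command writes."""

    def __init__(self):
        self._chunks = []

    def write(self, string):
        self._chunks.append(string)

    def write_line(self, string):
        self.write(string + "\n")

    def to_input_stream(self):
        # What one command wrote becomes what the next command reads.
        return InputStream("".join(self._chunks))
```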

Exceptions module

A container for all shell-related exceptions

exception cli.exceptions.ExitException[source]

This exception is raised when exit command is executed

exception cli.exceptions.LexException[source]

An exception that occurs during lexing

exception cli.exceptions.ParseException[source]

An exception that occurs during parsing

exception cli.exceptions.ShellException[source]

A base class for all shell exceptions