Parsing

The parser functions accept source code and an optional configuration object, and will generate CSTNode objects.

parse_module() is the most useful function here, since it accepts the entire contents of a file and returns a new tree. parse_expression() and parse_statement() are useful when inserting new nodes into the tree, because they’re easier to use than the equivalent node constructors.

>>> import libcst as cst
>>> cst.parse_expression("1 + 2")
BinaryOperation(
    left=Integer(
        value='1',
        lpar=[],
        rpar=[],
    ),
    operator=Add(
        whitespace_before=SimpleWhitespace(
            value=' ',
        ),
        whitespace_after=SimpleWhitespace(
            value=' ',
        ),
    ),
    right=Integer(
        value='2',
        lpar=[],
        rpar=[],
    ),
    lpar=[],
    rpar=[],
)

libcst.parse_module(source: str | bytes, config: PartialParserConfig = PartialParserConfig()) → Module

Accepts an entire Python module, including all leading and trailing whitespace.

If source is bytes, the encoding will be inferred and preserved. If the source is a string, we will default to assuming UTF-8 encoding if the module is rendered back out to source as bytes. It is recommended that when calling parse_module() with a string you access the serialized code using Module’s code attribute, and when calling it with bytes you access the serialized code using Module’s bytes attribute.

libcst.parse_expression(source: str, config: PartialParserConfig = PartialParserConfig()) → BaseExpression

Accepts an expression on a single line. Leading and trailing whitespace is not valid (there’s nowhere to store it on the expression node). parse_expression() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent an expression exactly, including all leading/trailing comments, you should instead use parse_module().

libcst.parse_statement(source: str, config: PartialParserConfig = PartialParserConfig()) → SimpleStatementLine | BaseCompoundStatement

Accepts a statement followed by a trailing newline. If a trailing newline is not provided, one will be added. parse_statement() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent a statement exactly, including all leading/trailing comments, you should instead use parse_module().

Leading comments and trailing comments (on the same line) are accepted, but whitespace (or anything else) after the statement’s trailing newline is not valid (there’s nowhere to store it on the statement node). Note that since there is nowhere to store leading and trailing comments/empty lines, code rendered out from a parsed statement using cst.Module([]).code_for_node(statement) will not include leading/trailing comments.

class libcst.PartialParserConfig

An optional object that can be supplied to the parser entrypoints (e.g. parse_module()) to configure the parser.

Unspecified fields will be inferred from the input source code or from the execution environment.

>>> import libcst as cst
>>> tree = cst.parse_module("abc")
>>> tree.bytes
b'abc'
>>> # override the default utf-8 encoding
... tree = cst.parse_module("abc", cst.PartialParserConfig(encoding="utf-32"))
>>> tree.bytes
b'\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00'

python_version: str | AutoConfig

The version of Python that the input source code is expected to be syntactically compatible with. This may be different from the Python interpreter being used to run LibCST. For example, you can parse code as 3.7 with a CPython 3.6 interpreter.

If unspecified, it will default to the syntax of the running interpreter, rounded down to the nearest version in the list below.

Currently, only Python 3.0, 3.1, 3.3, 3.5, 3.6, 3.7 and 3.8 syntax is supported. The gaps did not have any syntax changes from the version prior.

parsed_python_version: PythonVersionInfo

A named tuple with the major and minor Python version numbers. This is derived from python_version and should not be supplied to the PartialParserConfig constructor.

encoding: str | AutoConfig

The file’s encoding format. When parsing a bytes object, this value may be inferred from the contents of the parsed source code. When parsing a str, this value defaults to "utf-8".

future_imports: FrozenSet[str] | AutoConfig

Detected __future__ import names.

default_indent: str | AutoConfig

The indentation of the file, expressed as a series of tabs and/or spaces. This value is inferred from the contents of the parsed source code by default.

default_newline: str | AutoConfig

The newline of the file, expressed as \n, \r\n, or \r. This value is inferred from the contents of the parsed source code by default.

Syntax Errors

final class libcst.ParserSyntaxError

Contains an error encountered while trying to parse a piece of source code. This exception shouldn’t be constructed directly by the user, but instead may be raised by calls to parse_module(), parse_expression(), or parse_statement().

This does not inherit from SyntaxError because Python may raise a SyntaxError for any number of reasons, potentially leading to unintended behavior.

message: str

A human-readable explanation of the syntax error without information about where the error occurred.

For a human-readable explanation of the error alongside information about where it occurred, use __str__() (via str(ex)) instead.

raw_line: int: The one-indexed line where the error occured.

raw_column: int: The zero-indexed column as a number of characters from the start of the line where the error occured.

__str__() → str

A multi-line human-readable error message showing where the syntax error is in the parsed code. For example:

Syntax Error @ 2:1.
Incomplete input. Encountered end of file (EOF), but expected 'except', or 'finally'.

try: pass
         ^

property context: str | None

A formatted string containing the line of code with the syntax error (or a non-empty line above it) along with a caret indicating the exact column where the error occurred.

Returns None if there’s no relevant non-empty line to show (e.g. the file consists of only blank lines).

property editor_line: int

The expected one-indexed line in the user’s editor. This is the same as raw_line.

property editor_column: int

The expected one-indexed column that’s likely to match the behavior of the user’s editor, assuming tabs expand to 1-8 spaces. This is the column number shown when the syntax error is printed out with str.

This assumes single-width characters. However, because Python doesn’t ship with a wcwidth function, it’s hard to handle this properly without a third-party dependency.

For a raw zero-indexed character offset without tab expansion, see raw_column.