Parsing
The parser functions accept source code and an optional configuration object, and generate CSTNode objects.
parse_module() is the most useful function here, since it accepts the entire contents of a file and returns a new tree, but parse_expression() and parse_statement() are useful when inserting new nodes into the tree, because they’re easier to use than the equivalent node constructors.
>>> import libcst as cst
>>> cst.parse_expression("1 + 2")
BinaryOperation(
    left=Integer(
        value='1',
        lpar=[],
        rpar=[],
    ),
    operator=Add(
        whitespace_before=SimpleWhitespace(
            value=' ',
        ),
        whitespace_after=SimpleWhitespace(
            value=' ',
        ),
    ),
    right=Integer(
        value='2',
        lpar=[],
        rpar=[],
    ),
    lpar=[],
    rpar=[],
)
- libcst.parse_module(source: str | bytes, config: PartialParserConfig = PartialParserConfig()) -> Module
Accepts an entire Python module, including all leading and trailing whitespace.
If source is bytes, the encoding will be inferred and preserved. If source is a string, UTF-8 is assumed if the module is rendered back out to source as bytes. When calling parse_module() with a string, it is recommended that you access the serialized code using Module’s code attribute; when calling it with bytes, use Module’s bytes attribute.
- libcst.parse_expression(source: str, config: PartialParserConfig = PartialParserConfig()) -> BaseExpression
Accepts an expression on a single line. Leading and trailing whitespace is not valid (there’s nowhere to store it on the expression node).
parse_expression() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent an expression exactly, including all leading/trailing comments, you should instead use parse_module().
- libcst.parse_statement(source: str, config: PartialParserConfig = PartialParserConfig()) -> SimpleStatementLine | BaseCompoundStatement
Accepts a statement followed by a trailing newline. If a trailing newline is not provided, one will be added.
parse_statement() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent a statement exactly, including all leading/trailing comments, you should instead use parse_module().
Leading comments and trailing comments (on the same line) are accepted, but whitespace (or anything else) after the statement’s trailing newline is not valid (there’s nowhere to store it on the statement node). Note that since there is nowhere to store leading and trailing comments/empty lines, code rendered out from a parsed statement using cst.Module([]).code_for_node(statement) will not include leading/trailing comments.
- class libcst.PartialParserConfig
An optional object that can be supplied to the parser entrypoints (e.g. parse_module()) to configure the parser.
Unspecified fields will be inferred from the input source code or from the execution environment.
>>> import libcst as cst
>>> tree = cst.parse_module("abc")
>>> tree.bytes
b'abc'
>>> # override the default utf-8 encoding
... tree = cst.parse_module("abc", cst.PartialParserConfig(encoding="utf-32"))
>>> tree.bytes
b'\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00'
- python_version: str | AutoConfig
The version of Python that the input source code is expected to be syntactically compatible with. This may be different from the Python interpreter being used to run LibCST. For example, you can parse code as 3.7 with a CPython 3.6 interpreter.
If unspecified, it will default to the syntax of the running interpreter, rounded down to the nearest version in the list below.
Currently, only Python 3.0, 3.1, 3.3, 3.5, 3.6, 3.7 and 3.8 syntax is supported. The gaps did not have any syntax changes from the version prior.
- parsed_python_version: PythonVersionInfo
A named tuple with the major and minor Python version numbers. This is derived from python_version and should not be supplied to the PartialParserConfig constructor.
- encoding: str | AutoConfig
The file’s encoding format. When parsing a bytes object, this value may be inferred from the contents of the parsed source code. When parsing a str, this value defaults to "utf-8".
Syntax Errors
- final class libcst.ParserSyntaxError
Contains an error encountered while trying to parse a piece of source code. This exception shouldn’t be constructed directly by the user, but instead may be raised by calls to parse_module(), parse_expression(), or parse_statement().
This does not inherit from SyntaxError because Python may raise a SyntaxError for any number of reasons, potentially leading to unintended behavior.
- message: str
A human-readable explanation of the syntax error without information about where the error occurred.
For a human-readable explanation of the error alongside information about where it occurred, use __str__() (via str(ex)) instead.
- raw_column: int
The zero-indexed column as a number of characters from the start of the line where the error occurred.
- __str__() -> str
A multi-line human-readable error message showing where the syntax error is in the user’s code. For example:
Syntax Error @ 2:1.
Incomplete input. Encountered end of file (EOF), but expected 'except', or 'finally'.

try: pass
         ^
- property context: str | None
A formatted string containing the line of code with the syntax error (or a non-empty line above it) along with a caret indicating the exact column where the error occurred.
Returns None if there’s no relevant non-empty line to show (e.g. the file consists of only blank lines).
- property editor_line: int
The expected one-indexed line in the user’s editor. This is the same as raw_line.
- property editor_column: int
The expected one-indexed column that’s likely to match the behavior of the user’s editor, assuming tabs expand to 1-8 spaces. This is the column number shown when the syntax error is printed out with str().
This assumes single-width characters, because Python doesn’t ship with a wcwidth function and wide characters are hard to handle properly without a third-party dependency.
For a raw zero-indexed character offset without tab expansion, see raw_column.
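The relationship between raw_column and editor_column can be approximated with the standard library alone. The helper below is hypothetical and is not LibCST’s implementation; it assumes 8-column tab stops and single-width characters, matching the assumptions stated above.

```python
def approximate_editor_column(line: str, raw_column: int, tab_size: int = 8) -> int:
    """Hypothetical helper: convert a zero-indexed character offset into a
    one-indexed editor column by expanding tabs in the text before it."""
    prefix = line[:raw_column]
    # expandtabs() pads each tab out to the next multiple of tab_size,
    # so a single tab contributes between 1 and tab_size columns.
    return len(prefix.expandtabs(tab_size)) + 1


# A leading tab pushes the second character to column 9.
print(approximate_editor_column("\tx = (", 1))
# A tab after "ab" only needs 6 spaces to reach the next tab stop.
print(approximate_editor_column("ab\tc", 3))
```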