The parser functions accept source code and an optional configuration object,
and will generate the corresponding tree nodes.
parse_module() is the most useful function here, since it accepts
the entire contents of a file and returns a new tree, but
parse_expression() and parse_statement() are also useful
when inserting new nodes into the tree, because they’re easier to use than the
equivalent node constructors.
>>> import libcst as cst
>>> cst.parse_expression("1 + 2")
BinaryOperation(
    left=Integer(
        value='1',
        lpar=[],
        rpar=[],
    ),
    operator=Add(
        whitespace_before=SimpleWhitespace(
            value=' ',
        ),
        whitespace_after=SimpleWhitespace(
            value=' ',
        ),
    ),
    right=Integer(
        value='2',
        lpar=[],
        rpar=[],
    ),
    lpar=[],
    rpar=[],
)
- libcst.parse_module(source: Union[str, bytes], config: PartialParserConfig = PartialParserConfig()) → Module
Accepts an entire Python module, including all leading and trailing whitespace.
If source is bytes, the encoding will be inferred and preserved. If the source is a string, we will default to assuming UTF-8 encoding if the module is rendered back out to source as bytes. It is recommended that when calling parse_module() with a string you access the serialized code using Module’s code attribute, and when calling it with bytes you access the serialized code using Module’s bytes attribute.
- libcst.parse_expression(source: str, config: PartialParserConfig = PartialParserConfig()) → BaseExpression
Accepts an expression on a single line. Leading and trailing whitespace is not valid (there’s nowhere to store it on the expression node).
parse_expression() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent an expression exactly, including all leading/trailing comments, you should instead use parse_module().
- libcst.parse_statement(source: str, config: PartialParserConfig = PartialParserConfig()) → Union[SimpleStatementLine, BaseCompoundStatement]
Accepts a statement followed by a trailing newline. If a trailing newline is not provided, one will be added.
parse_statement() is provided mainly as a convenience function to generate semi-complex trees from code snippets. If you need to represent a statement exactly, including all leading/trailing comments, you should instead use parse_module().
Leading comments and trailing comments (on the same line) are accepted, but whitespace (or anything else) after the statement’s trailing newline is not valid (there’s nowhere to store it on the statement node). Note that since there is nowhere to store leading and trailing comments/empty lines, code rendered out from a parsed statement using cst.Module([]).code_for_node(statement) will not include leading/trailing comments.
- class libcst.PartialParserConfig
An optional object that can be supplied to the parser entrypoints (e.g.
parse_module()) to configure the parser.
Unspecified fields will be inferred from the input source code or from the execution environment.
>>> import libcst as cst
>>> tree = cst.parse_module("abc")
>>> tree.bytes
b'abc'
>>> # override the default utf-8 encoding
... tree = cst.parse_module("abc", cst.PartialParserConfig(encoding="utf-32"))
>>> tree.bytes
b'\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00'
- python_version: Union[str, AutoConfig]
The version of Python that the input source code is expected to be syntactically compatible with. This may be different from the Python interpreter being used to run LibCST. For example, you can parse code as 3.7 with a CPython 3.6 interpreter.
If unspecified, it will default to the syntax of the running interpreter (rounding down from among the following list).
Currently, only Python 3.0, 3.1, 3.3, 3.5, 3.6, 3.7 and 3.8 syntax is supported. The gaps did not have any syntax changes from the version prior.
- parsed_python_version: PythonVersionInfo
- encoding: Union[str, AutoConfig]
The file’s encoding format. When parsing a bytes object, this value may be inferred from the contents of the parsed source code. When parsing a str, this value defaults to "utf-8".
- default_indent: Union[str, AutoConfig]
The indentation of the file, expressed as a series of tabs and/or spaces. This value is inferred from the contents of the parsed source code by default.
- class libcst.ParserSyntaxError
Contains an error encountered while trying to parse a piece of source code. This exception shouldn’t be constructed directly by the user, but instead may be raised by calls to parse_module(), parse_expression(), or parse_statement().
- message: str
A human-readable explanation of the syntax error without information about where the error occurred.
For a human-readable explanation of the error alongside information about where it occurred, use __str__().
- raw_column: int
The zero-indexed column as a number of characters from the start of the line where the error occurred.
- __str__() → str
A multi-line, human-readable error message showing where the syntax error is in the user’s code. For example:
Syntax Error @ 2:1.
Incomplete input. Encountered end of file (EOF), but expected 'except', or 'finally'.

try: pass
         ^
- property context: Optional[str]
A formatted string containing the line of code with the syntax error (or a non-empty line above it) along with a caret indicating the exact column where the error occurred.
None if there’s no relevant non-empty line to show (e.g. the file consists of only blank lines).
- property editor_line: int
The expected one-indexed line in the user’s editor. This is the same as
- property editor_column: int
The expected one-indexed column that’s likely to match the behavior of the user’s editor, assuming tabs expand to 1-8 spaces. This is the column number shown when the syntax error is printed out with str.
This assumes single-width characters. However, because Python doesn’t ship with a wcwidth function, it’s hard to handle this properly without a third-party dependency.
For a raw zero-indexed character offset without tab expansion, see raw_column.