
Scope Analysis

Scope analysis keeps track of assignments and accesses, which is useful for automatic code refactoring. If you’re not familiar with scope analysis, see Scope Metadata for more detail. If you’re new to metadata, see the Metadata Tutorial to get started. This tutorial demonstrates some use cases of scope analysis. Given source code, scope analysis records every variable Assignment (or BuiltinAssignment, if the name is a builtin) and every Access, and stores them in Scope containers.

Note

Scope analysis only handles local variable name accesses and cannot handle simple string type annotation forward references. See Access for details.

The following example source code contains several unused imports (b, f, i, m and o) and two undefined references (func_undefined and var_undefined). Scope analysis helps us identify those unused imports and undefined references, so that we can automatically warn developers about potential bugs while they are developing.

[2]:
source = """\
import a, b, c as d, e as f  # expect to keep: a, c as d
from g import h, i, j as k, l as m  # expect to keep: h, j as k
from n import o  # expect to be removed entirely

a()

def fun():
    d()

class Cls:
    att = h.something

    def __new__(self) -> "Cls":
        var = k.method()
        func_undefined(var_undefined)
"""

With a parsed Module, we construct a MetadataWrapper object, which provides a resolve() function to resolve metadata given a metadata provider. ScopeProvider is used here to analyze scopes. There are three types of scopes in this example (GlobalScope, FunctionScope and ClassScope); FunctionScope appears twice, once for fun and once for __new__.

[3]:
import libcst as cst


wrapper = cst.metadata.MetadataWrapper(cst.parse_module(source))
scopes = set(wrapper.resolve(cst.metadata.ScopeProvider).values())
for scope in scopes:
    print(scope)
<libcst.metadata.scope_provider.GlobalScope object at 0x7f64b80b7fe0>
<libcst.metadata.scope_provider.FunctionScope object at 0x7f64b43acc20>
<libcst.metadata.scope_provider.FunctionScope object at 0x7f64b43ac290>
<libcst.metadata.scope_provider.ClassScope object at 0x7f64b43adbe0>

Warn on unused imports and undefined references

To find all unused imports, we iterate through the assignments of each scope, focusing on Import/ImportFrom assignments; an assignment is unused when its references collection is empty. To find all undefined references, we iterate through the accesses; an access is an undefined reference when its referents collection is empty. When reporting a warning to the developer, we want to include the line number and column offset so the message is easy to act on. We can get position information from PositionProvider and print the warnings as follows.

[4]:
from collections import defaultdict
from typing import Dict, Union, Set

unused_imports: Dict[Union[cst.Import, cst.ImportFrom], Set[str]] = defaultdict(set)
undefined_references: Dict[cst.CSTNode, Set[str]] = defaultdict(set)
ranges = wrapper.resolve(cst.metadata.PositionProvider)
for scope in scopes:
    for assignment in scope.assignments:
        node = assignment.node
        if isinstance(assignment, cst.metadata.Assignment) and isinstance(
            node, (cst.Import, cst.ImportFrom)
        ):
            if len(assignment.references) == 0:
                unused_imports[node].add(assignment.name)
                location = ranges[node].start
                print(
                    f"Warning on line {location.line:2d}, column {location.column:2d}: Imported name `{assignment.name}` is unused."
                )

    for access in scope.accesses:
        if len(access.referents) == 0:
            node = access.node
            # Record the undefined name as well, so it can be used after the loop.
            undefined_references[node].add(node.value)
            location = ranges[node].start
            print(
                f"Warning on line {location.line:2d}, column {location.column:2d}: Name reference `{node.value}` is not defined."
            )

Warning on line  1, column  0: Imported name `b` is unused.
Warning on line  1, column  0: Imported name `f` is unused.
Warning on line  2, column  0: Imported name `i` is unused.
Warning on line  2, column  0: Imported name `m` is unused.
Warning on line  3, column  0: Imported name `o` is unused.
Warning on line 15, column  8: Name reference `func_undefined` is not defined.
Warning on line 15, column 23: Name reference `var_undefined` is not defined.

Automatically Remove Unused Import

An unused import is a common code suggestion provided by lint tools, such as flake8's F401 (imported but unused). Reporting unused imports is already useful, but with LibCST we can also provide an automatic fix that removes them. That makes the suggestion more actionable and saves developers time.

An import statement may import multiple names, and we want to remove only the unused names from it. If none of the names in the import statement is used, we remove the entire import. To do this, we implement RemoveUnusedImportTransformer by subclassing CSTTransformer and override leave_Import and leave_ImportFrom to modify the import statements. When we find the import node in the lookup table, we iterate through all of its names and collect the used ones in names_to_keep. If names_to_keep is empty, all names are unused and we remove the entire import node. Otherwise, we update the import node so that it keeps only the names in names_to_keep.

[5]:
class RemoveUnusedImportTransformer(cst.CSTTransformer):
    def __init__(
        self, unused_imports: Dict[Union[cst.Import, cst.ImportFrom], Set[str]]
    ) -> None:
        self.unused_imports = unused_imports

    def leave_import_alike(
        self,
        original_node: Union[cst.Import, cst.ImportFrom],
        updated_node: Union[cst.Import, cst.ImportFrom],
    ) -> Union[cst.Import, cst.ImportFrom, cst.RemovalSentinel]:
        if original_node not in self.unused_imports:
            return updated_node
        names_to_keep = []
        for name in updated_node.names:
            asname = name.asname
            if asname is not None:
                name_value = asname.name.value
            else:
                name_value = name.name.value
            if name_value not in self.unused_imports[original_node]:
                names_to_keep.append(name.with_changes(comma=cst.MaybeSentinel.DEFAULT))
        if len(names_to_keep) == 0:
            return cst.RemoveFromParent()
        else:
            return updated_node.with_changes(names=names_to_keep)

    def leave_Import(
        self, original_node: cst.Import, updated_node: cst.Import
    ) -> Union[cst.Import, cst.ImportFrom, cst.RemovalSentinel]:
        # May return a RemovalSentinel when every imported name is unused.
        return self.leave_import_alike(original_node, updated_node)

    def leave_ImportFrom(
        self, original_node: cst.ImportFrom, updated_node: cst.ImportFrom
    ) -> Union[cst.Import, cst.ImportFrom, cst.RemovalSentinel]:
        return self.leave_import_alike(original_node, updated_node)

After the transform, we use .code to generate the fixed code, and all unused names are removed as expected. We use difflib to show only the changed part: only the import lines are updated.

[6]:
import difflib

fixed_module = wrapper.module.visit(RemoveUnusedImportTransformer(unused_imports))

# Use difflib to show the changes to verify unused imports are removed as expected.
print(
    "".join(
        difflib.unified_diff(
            source.splitlines(keepends=True),
            fixed_module.code.splitlines(keepends=True),
        )
    )
)
---
+++
@@ -1,6 +1,5 @@
-import a, b, c as d, e as f  # expect to keep: a, c as d
-from g import h, i, j as k, l as m  # expect to keep: h, j as k
-from n import o  # expect to be removed entirely
+import a, c as d  # expect to keep: a, c as d
+from g import h, j as k  # expect to keep: h, j as k

 a()
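The `---` and `+++` header lines in the diff above carry no file names because none were passed to difflib.unified_diff. A small standard-library sketch shows how labels can be supplied. The strings `before` and `after` and the labels `before.py` and `after.py` are made up for illustration.

```python
import difflib

# Hypothetical before/after contents, standing in for the original and fixed code.
before = "import a, b\n\na()\n"
after = "import a\n\na()\n"

diff = "".join(
    difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile="before.py",  # label shown on the `---` line
        tofile="after.py",  # label shown on the `+++` line
    )
)
print(diff)
```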