PySpark Stubs
A collection of the Apache Spark stub files. These files were generated by stubgen and manually edited to include accurate type hints.
Tests and configuration files were originally contributed to the Typeshed project. Please refer to its contributors list and license for details.
Important
This project has been merged with the main Apache Spark repository (SPARK-32714). All further development for Spark 3.1 and onwards will be continued there.
For Spark 2.4 and 3.0, development of this package will continue until their official deprecation.
- If your problem is specific to Spark 2.4 or 3.0, feel free to create an issue or open a pull request here.
- Otherwise, please check the official Spark JIRA and contributing guidelines. If you create a JIRA ticket or Spark PR related to type hints, please ping me with [~zero323] or @zero323 respectively. Thanks in advance.
Motivation
- Static error detection (see SPARK-20631).
- Improved autocompletion.
Installation and usage
Please note that the guidelines for distribution of type information are still a work in progress (PEP 561 - Distributing and Packaging Type Information). Currently, the installation script overlays existing Spark installations (pyi stub files are copied next to their py counterparts in the PySpark installation directory). If this approach is not acceptable, you can add stub files to the search path manually.
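To see what the overlay looks like on disk, the following minimal sketch lists the py modules in a package directory that have a pyi stub sitting next to them. The helper name `modules_with_stubs` is hypothetical and not part of this package; it only inspects the file system and does not import PySpark.

```python
from pathlib import Path
from typing import List


def modules_with_stubs(package_dir: Path) -> List[str]:
    """Return relative paths of .py modules that have a .pyi stub beside them.

    Hypothetical helper for illustration only; it is not shipped with
    pyspark-stubs.
    """
    found: List[str] = []
    for py_file in sorted(package_dir.rglob("*.py")):
        # The overlay installs "name.pyi" in the same directory as "name.py".
        if py_file.with_suffix(".pyi").is_file():
            found.append(py_file.relative_to(package_dir).as_posix())
    return found
```

Pointing this at the directory of an installed `pyspark` package should give a rough idea of which modules are covered by overlaid stubs; an empty result suggests the stubs were not installed as an overlay.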
According to PEP 484:
Third-party stub packages can use any location for stub storage. Type checkers should search for them using PYTHONPATH.
Moreover:
A default fallback directory that is always checked is shared/typehints/python3.5/ (or 3.6, etc.)
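As a rough illustration of adding stubs to the search path manually, the sketch below prepares an environment for a type checker process. It assumes Mypy, which reads additional search directories from the `MYPYPATH` environment variable; the helper name `env_with_stub_path` and the example paths are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_stub_path(
    stub_dir: str,
    base_env: Optional[Mapping[str, str]] = None,
    variable: str = "MYPYPATH",
) -> Dict[str, str]:
    """Return a copy of the environment with stub_dir first on the search path.

    Hypothetical helper; "MYPYPATH" is the variable Mypy consults, other
    checkers may use a different mechanism.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get(variable, "")
    # Keep existing entries, dropping blanks and any duplicate of stub_dir.
    others = [p for p in existing.split(os.pathsep) if p and p != stub_dir]
    env[variable] = os.pathsep.join([stub_dir] + others)
    return env


# Example use (paths are placeholders):
# subprocess.run(["mypy", "job.py"], env=env_with_stub_path("/path/to/stubs"))
```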
Please check usage before proceeding.
The package is available on PyPI:
pip install pyspark-stubs
and conda-forge:
conda install -c conda-forge pyspark-stubs
Depending on your environment, you might also need a type checker, like Mypy or Pytype [1], and an autocompletion tool, like Jedi.
Editor | Type checking | Autocompletion | Notes |
---|---|---|---|
Atom | ✔ [2] | ✔ [3] | Through plugins. |
IPython / Jupyter Notebook | ✘ [4] | ✔ | |
PyCharm | ✔ | ✔ | |
PyDev | ✔ [5] | ? | |
VIM / Neovim | ✔ [6] | ✔ [7] | Through plugins. |
Visual Studio Code | ✔ [8] | ✔ [9] | Completion with plugin. |
Environment independent / other editors | ✔ [10] | ✔ [11] | Through Mypy and Jedi. |
This package is tested against the Mypy development branch and, in rare cases (primarily important upstream bugfixes), is not compatible with the preceding Mypy release.
PySpark Version Compatibility
Package versions follow PySpark versions with the exception of maintenance releases - i.e. pyspark-stubs==2.3.0 should be compatible with pyspark>=2.3.0,<2.4.0. Maintenance releases (post1, post2, ..., postN) are reserved for internal annotation updates.
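The versioning rule can be sketched as a small check. This is an illustration only: it assumes the rule generalizes from the 2.3.0 example to every minor series (same major and minor number means compatible), and the function names are hypothetical.

```python
import re
from typing import Tuple


def minor_series(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from versions like "2.3.0" or "2.4.0.post9"."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.|$)", version)
    if match is None:
        raise ValueError(f"Unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def compatible_specifier(stubs_version: str) -> str:
    """Build the PySpark requirement implied by a pyspark-stubs version.

    Assumption: postN maintenance releases do not change the target series.
    """
    major, minor = minor_series(stubs_version)
    return f"pyspark>={major}.{minor}.0,<{major}.{minor + 1}.0"


def is_compatible(stubs_version: str, pyspark_version: str) -> bool:
    """True when both versions belong to the same major.minor series."""
    return minor_series(stubs_version) == minor_series(pyspark_version)
```

For instance, `compatible_specifier("2.3.0")` reproduces the range quoted above, and a `post` suffix on the stubs version leaves the result unchanged.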
API Coverage
As of release 2.4.0, most of the public API is covered. For details, please check the API coverage document.
See also
- SPARK-17333 - Make pyspark interface friendly with static analysis.
- PySpark typing hints and Revisiting PySpark type annotations on Apache Spark Developers List.
Disclaimer
Apache Spark, Spark, PySpark, Apache, and the Spark logo are trademarks of The Apache Software Foundation. This project is not owned, endorsed, or sponsored by The Apache Software Foundation.
Footnotes
[1] | Not supported or tested. |
[2] | Requires atom-mypy or equivalent. |
[3] | Requires autocomplete-python-jedi or equivalent. |
[4] | It is possible to use magics to type check directly in the notebook. In general, though, you'll have to export the whole notebook to a .py file and run the type checker on the result. |
[5] | Requires PyDev 7.0.3 or later. |
[6] | TODO Using vim-mypy, syntastic or Neomake. |
[7] | With jedi-vim. |
[8] | With Mypy linter. |
[9] | With Python extension for Visual Studio Code. |
[10] | Just use your favorite checker directly, optionally combined with a tool like entr. |
[11] | See Jedi editor plugins list. |
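The notebook export mentioned in footnote [4] is usually done with a tool such as `jupyter nbconvert --to script`. As a minimal sketch of the same idea, the hypothetical helper below reads the notebook JSON (nbformat 4 layout, with a top-level `cells` list), keeps only code cells, and replaces lines that look like IPython magics or shell escapes with `pass` so the result stays valid Python. The magic detection is a simple heuristic, not a full parser.

```python
import json
from typing import List


def notebook_to_source(notebook_json: str) -> str:
    """Concatenate the code cells of a notebook into one Python source string.

    Hypothetical helper for illustration; assumes the nbformat 4 structure.
    """
    notebook = json.loads(notebook_json)
    chunks: List[str] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        text = "".join(source) if isinstance(source, list) else source
        lines: List[str] = []
        for line in text.splitlines():
            stripped = line.lstrip()
            if stripped.startswith(("%", "!")):
                # Heuristic: treat as a magic or shell escape and neutralize
                # it, keeping the indentation so enclosing blocks stay valid.
                indent = line[: len(line) - len(stripped)]
                lines.append(f"{indent}pass  # {stripped}")
            else:
                lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"
```

The returned string can be written to a .py file and passed to the type checker of your choice.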