PyChemia, Python Framework for Materials Discovery and Design

Materials Discovery Group

Last update: Oct 2, 2022

Overview

PyChemia is an open-source Python library for materials structural search. The purpose of the initiative is to create a method-agnostic framework for materials discovery and design using a variety of methods, from Minima Hopping to soft-computing-based methods. PyChemia is also a library for data mining, using several methods to discover interesting candidates among the materials already processed.

The core of the library is the Structure Python class, which is able to describe periodic and non-periodic structures. As the focus of this library is structural search, the class defines extensive capabilities to modify atomic structures.
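To make the periodic versus non-periodic distinction concrete, here is a minimal sketch. ToyStructure and all of its attribute names are hypothetical and are not PyChemia's Structure class or its API. It only assumes that a periodic structure carries a cell of three lattice vectors and stores positions in reduced coordinates, while a non-periodic one has no cell and stores Cartesian positions directly.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ToyStructure:
    """Illustrative stand-in only; NOT PyChemia's Structure class."""
    symbols: List[str]
    # Reduced coordinates when a cell is given, Cartesian coordinates otherwise.
    positions: List[Sequence[float]]
    # Three lattice vectors for a periodic structure, None for a molecule/cluster.
    cell: Optional[List[Sequence[float]]] = None

    @property
    def is_periodic(self) -> bool:
        return self.cell is not None

    def cartesian(self) -> List[List[float]]:
        """Return Cartesian positions (each row is r = f @ cell)."""
        if not self.is_periodic:
            return [list(p) for p in self.positions]
        return [
            [sum(f[i] * self.cell[i][k] for i in range(3)) for k in range(3)]
            for f in self.positions
        ]


a = 3.84652  # cubic lattice parameter, used here only as an example value
crystal = ToyStructure(
    symbols=["Sr", "V"],
    positions=[(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)],
    cell=[(a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, a)],
)
molecule = ToyStructure(symbols=["H", "H"], positions=[(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)])

print(crystal.is_periodic, crystal.cartesian()[1])
print(molecule.is_periodic, molecule.cartesian()[1])
```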

The library can read and write files for several ab-initio codes. At the DFT level, PyChemia supports VASP, ABINIT and Octopus. At the tight-binding level, development is in progress to support DFTB+ and Fireball. This allows the library to compute electronic-structure properties using state-of-the-art ab-initio software packages and extract properties from those calculations.

Installation

You can install PyChemia in several ways. We show three ways of installing PyChemia inside a virtual environment. A virtual environment is a good way of isolating software packages from the packages installed with the operating system. Which method to use depends on whether you want the most recent code or the package uploaded from time to time to PyPI. The last method is particularly suited for developers who want to change the code and have those changes take effect without an explicit installation.

Installing with pip from pypi.org on a virtual environment

This method installs PyChemia from the packages uploaded to PyPI every month. It provides a stable version of PyChemia.

First, create and activate the virtual environment. We are using the name pychemia_ve, but that is arbitrary.

virtualenv pychemia_ve
source pychemia_ve/bin/activate

When the virtual environment is activated, your prompt changes to (pychemia_ve)...$. Now install PyChemia with pip:

python3 -m pip install pychemia
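After the install, a quick check from Python can confirm that the interpreter is running inside the virtual environment and that pip registered the package. This is a minimal sketch using only the standard library; the helper names in_virtualenv and installed_version are ours, and it assumes the distribution is registered under the name pychemia and that Python 3.8 or newer is used (for importlib.metadata).

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv/virtualenv environment."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("virtual environment active:", in_virtualenv())
print("pychemia version:", installed_version("pychemia"))
```

If the second line prints None, pip did not install the package into the interpreter you are running.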

Installing with pip from a cloned repo on a virtual environment

This method installs PyChemia from the GitHub repo, using the most recent sources.

First, create and activate the virtual environment. We are using the name pychemia_ve, but that is arbitrary.

virtualenv pychemia_ve
source pychemia_ve/bin/activate

Second, clone the repository from GitHub

git clone https://github.com/MaterialsDiscovery/PyChemia.git

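Finally, install from the cloned folder. The leading ./ makes pip treat the argument as a local directory rather than as a package name to fetch from PyPI.

python3 -m pip install ./PyChemia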

Using PyChemia from repo folder on a virtual environment

This method is mostly used for development. PyChemia is not actually installed, so changes to the code take immediate effect.

First, create and activate the virtual environment. We are using the name pychemia_ve, but that is arbitrary.

virtualenv pychemia_ve
source pychemia_ve/bin/activate

Clone the repository

git clone https://github.com/MaterialsDiscovery/PyChemia.git

Go to repo folder, install Cython with pip and execute setup.py to build the Cython modules.

cd PyChemia
python3 -m pip install Cython
python3 setup.py build_ext --inplace
python3 setup.py build

Finally, install the packages required for PyChemia to work

python3 -m pip install -r requirements.txt

Set the variable $PYTHONPATH to point to the PyChemia folder, replacing /path/to/PyChemia with the location of your clone. In bash:

export PYTHONPATH=/path/to/PyChemia

On C shell (csh or tcsh):

setenv PYTHONPATH /path/to/PyChemia
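To confirm that the variable was set as intended, the following sketch checks whether a folder appears in a PYTHONPATH-style string. The helper names are ours, and /path/to/PyChemia is a placeholder for wherever the repository was cloned.

```python
import os
from pathlib import Path


def pythonpath_entries(value: str):
    """Split a PYTHONPATH-style string into resolved paths, skipping empty entries."""
    return [Path(p).expanduser().resolve() for p in value.split(os.pathsep) if p]


def is_on_pythonpath(folder: str, value: str) -> bool:
    """True if folder is one of the entries in the PYTHONPATH-style string."""
    target = Path(folder).expanduser().resolve()
    return target in pythonpath_entries(value)


repo = "/path/to/PyChemia"  # placeholder: replace with the location of your clone
print(is_on_pythonpath(repo, os.environ.get("PYTHONPATH", "")))
```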

PyChemia requirements

PyChemia relies on a number of other Python packages to operate. Some are mandatory and must be installed. Others are optional, and their absence only limits certain capabilities.

Mandatory

Python >= 3.6: The library is tested on Travis for Python 3.6 up to 3.9. Support for Python 2.7 has been removed. Build status: https://travis-ci.org/MaterialsDiscovery/PyChemia
Numpy >= 1.19: Fundamental library for numerically intensive computation in Python. Numpy arrays are essential for efficient array manipulation.
SciPy >= 1.5: Used mostly for linear algebra, FFT and spatial routines.
Spglib >= 1.9: Used to determine symmetry groups for periodic structures.
Matplotlib >= 3.3: Used to plot band structures, densities of states and other 2D plots.
PyMongo >= 3.11: Used for structural search. PyChemia relies strongly on MongoDB and its Python driver. For the MongoDB server, any version beyond 3.11 should be fine. We have tested PyChemia on MongoDB 4.0.
psutil >= 5.8: Cross-platform library for process and system monitoring in Python.
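The minimum versions above can be checked against the current environment with a short script. This is a sketch, not part of PyChemia: the helper names are ours, it assumes the packages are registered under their usual PyPI distribution names, and it compares only the leading numeric components of each version string.

```python
import re
from importlib import metadata

# Minimum versions copied from the mandatory list above.
MINIMUMS = {
    "numpy": "1.19",
    "scipy": "1.5",
    "spglib": "1.9",
    "matplotlib": "3.3",
    "pymongo": "3.11",
    "psutil": "5.8",
}


def version_tuple(text: str):
    """Turn '1.19.5' or '3.3.0rc1' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check(minimums):
    """Map each distribution name to 'ok', 'too old (x)' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(found) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old ({found})"
    return report


for name, status in check(MINIMUMS).items():
    print(f"{name:12s} {status}")
```

Comparing integer tuples rather than strings matters here: as strings, "1.9" sorts after "1.19", which would give the wrong answer for Spglib and Numpy.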

Optional

nose >= 1.3.7: A Python library for testing. Go to the source directory and execute:

nosetests -v

pytest: Another utility for testing.
Pandas: Library for data analysis, used by the data-mining modules.
PyMC: A Python module that implements Bayesian statistical models and fitting algorithms. Important for the data-mining capabilities of PyChemia.
Mayavi >= 4.1: Some basic visualization tools are incorporated using this library.
ScientificPython > 2.6: Used for reading and writing NetCDF files.
pymatgen >= 2.9: An excellent library for materials analysis.
ASE: The Atomic Simulation Environment is another good library for ab-initio calculations, impressive for the number of ab-initio packages it supports.
qmpy: The Python library behind the Open Quantum Materials Database (OQMD), a database of DFT-calculated structures. For the time being the database contains more than 300000 structures, more than 90% of them with the electronic ground state computed.
coverage >= 4.0.1: Provides code coverage analysis.
python-coveralls: Used to submit coverage information to coveralls.io: https://coveralls.io/github/MaterialsDiscovery/PyChemia

Documentation

Instructions for installing, using, and scripting with PyChemia can be found on two documentation sites:

Read The Docs:

http://pychemia.readthedocs.io/en/latest

Python Hosted:

http://pythonhosted.org/pychemia

The Read the Docs documentation is also available through the short URLs readthedocs and rtfd.

Sources

The main repository is on GitHub

Sources and wheel binaries are also distributed on PyPI.

Structure of the Library

Contributors

Prof. Aldo H. Romero [West Virginia University] (Project Director)
Guillermo Avendaño-Franco [West Virginia University] (Basic Infrastructure)
Adam Payne [West Virginia University] (Bug fixes: Populations, Relaxators, and KPoints)
Irais Valencia Jaime [West Virginia University] (Simulation and testing)
Sobhit Singh [West Virginia University] (Data-mining)
Francisco Muñoz [Universidad de Chile] (PyPROCAR)
Wilfredo Ibarra Hernandez [West Virginia University] (Interface with MAISE)

Comments

conda-forge package
Hi,

I recently created the pychemia conda-forge package, so it is now possible to install PyChemia via:

conda install -c conda-forge pychemia

I was wondering if one of the core developers is interested in joining me in maintaining the package.

Best,

Jan
opened by jan-janssen 2
Installation problem

Dear Developers,

I am trying to install pychemia by

pip3 install pychemia --user

But it fails with the following errors. Any help is appreciated.

    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-rd5hvw21/pychemia/setup.py", line 141, in <module>
        data = write_version_py()
      File "/tmp/pip-install-rd5hvw21/pychemia/setup.py", line 89, in write_version_py
        release_data, FULLVERSION, GIT_REVISION = get_version_info()
      File "/tmp/pip-install-rd5hvw21/pychemia/setup.py", line 47, in get_version_info
        rf = open('setup.json')
    FileNotFoundError: [Errno 2] No such file or directory: 'setup.json'
    Using Cython: True
    ----------------------------------------
    ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-rd5hvw21/pychemia/

Thanks,

Best \Alex

opened by akentphonon 1
Removed pyximport from lennardjones since it only works if cython is installed.

Dear Guillermo,

Since setup.py already compiles the shared library required for lennardjones utilities for both cython and non-cython, this is unnecessary. It would be useful in the case that the shared library is not compiled and the import from the .pyx file needed to be imported directly without any compilation whatsoever.

With the current setup.py, keeping the .pyx and .c in the distribution package will make sure it will work when both cython is present and not present.

Best, Uthpala

opened by uthpalaherath 0
Fixed lennardjones error that arises when installing with pip

Hello Guillermo,

I added the lines

    import pyximport
    pyximport.install()

to __init__.py and lj.py in pychemia/code/lennardjones to fix an import error that occurs when installing PyChemia with pip.

Best, Uthpala

opened by uthpalaherath 0
Created VaspXML object as a CodeOutput object, fixed a bug in reading outcar, added writing to text and returning dict to DensityOfStates object

Created a VaspXML object as a CodeOutput object. Many features were added, such as total, projected and parametric density of states.

Fixed a bug in reading OUTCAR energies.

Added writing to text and returning a dict to the DensityOfStates object.
opened by petavazohi 0
Conserve repeating order of atoms in POSCAR -update

This update allows the repeating order in atoms in POSCAR to be conserved.

For example:

    Sr V O
    1.0
    3.8465199999999999 0.0000000000000000 0.0000000000000000
    0.0000000000000000 3.8465199999999999 0.0000000000000000
    0.0000000000000000 0.0000000000000000 3.8465199999999999
    Sr V Sr O
    1 1 1 3
    Direct
    0.0000000000000000 0.0000000000000000 0.0000000000000000
    0.5000000000000000 0.5000000000000000 0.5000000000000000
    0.5000000000000000 0.5000000000000000 0.0000000000000000
    0.5000000000000000 0.0000000000000000 0.5000000000000000
    0.5000000000000000 0.5000000000000000 0.0000000000000000
    0.0000000000000000 0.5000000000000000 0.5000000000000000

Here the order Sr, V, Sr, O will be conserved when using the structure to generate POTCARs and POSCARs for kgrid, encut convergence and relaxation with PyChemia. Otherwise, it reverts to Sr, V, O ignoring the order of the repetition.

This is helpful for performing calculations for heterostructures where atoms in layers are ordered separately. The order of the POTCAR concatenation will follow this too.

-Uthpala

opened by uthpalaherath 0
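The difference between conserving and ignoring the repetition can be sketched on the species and count lines of the example POSCAR above. This is an illustration only and not PyChemia's implementation; the function names are hypothetical.

```python
def species_blocks(symbols_line: str, counts_line: str):
    """Pair each symbol with its count, keeping repeated symbols in file order."""
    symbols = symbols_line.split()
    counts = [int(n) for n in counts_line.split()]
    if len(symbols) != len(counts):
        raise ValueError("symbol and count lines differ in length")
    return list(zip(symbols, counts))


def merged_species(blocks):
    """Collapse repeated symbols, which is what ignoring the repetition amounts to."""
    merged = {}
    for symbol, count in blocks:
        merged[symbol] = merged.get(symbol, 0) + count
    return list(merged.items())


blocks = species_blocks("Sr V Sr O", "1 1 1 3")
print(blocks)                  # [('Sr', 1), ('V', 1), ('Sr', 1), ('O', 3)]
print(merged_species(blocks))  # [('Sr', 2), ('V', 1), ('O', 3)]
```

With the ordered form, a POTCAR would be concatenated as Sr, V, Sr, O, matching the layer-by-layer ordering of the atoms in a heterostructure.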
free energy atom are changed to free energy per atom for convergence

Changed the free energy in the convergence test to free energy per atom. Also fixed a bug in the k-point convergence when selecting the best k-point grid: previously it took the last calculated kgrid as the best kgrid.

Changed the La and Ac groups to 'd' in pychemia/utils/periodic, as they don't have f electrons. I hope I'm right.

opened by petavazohi 0
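The two points in the comment above, normalizing by the number of atoms and not simply taking the last grid, can be illustrated with a small sketch. The selection rule used here (the coarsest grid whose energy per atom is within a tolerance of the next denser grid) is one possible criterion chosen for illustration and is not necessarily the one PyChemia uses. The function names, the tolerance and the energies are all made-up example values.

```python
def energy_per_atom(free_energy: float, natoms: int) -> float:
    """Normalize a total free energy by the number of atoms in the cell."""
    return free_energy / natoms


def coarsest_converged(grids, energies, tol=1e-3):
    """Return the coarsest grid whose energy is within tol of the next denser grid.

    grids and energies must be ordered from coarse to dense.
    Returns None if no consecutive pair meets the tolerance.
    """
    for i in range(1, len(grids)):
        if abs(energies[i] - energies[i - 1]) < tol:
            return grids[i - 1]
    return None


# Made-up total free energies (eV) for a hypothetical 5-atom cell.
grids = [(2, 2, 2), (4, 4, 4), (6, 6, 6), (8, 8, 8)]
totals = [-40.100, -40.500, -40.503, -40.5035]
per_atom = [energy_per_atom(e, 5) for e in totals]

best = coarsest_converged(grids, per_atom, tol=1e-3)
print("best grid:", best)
print("last grid:", grids[-1])
```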
tutorial/example for "predict next experiment"
Be it composition- or structure-based, could you point to a place in the docs or provide an example that follows the general idea of:

    from pychemia import Discover
    mdl = Discover()
    mdl.fit(X_train)
    next_experiment = mdl.suggest_next_experiment()

I'm hoping to be able to compare PyChemia with mat_discover.
opened by sgbaird 2

updated relax.py to keep EDIFF provided by user and only use 1E-04 if not provided

Hello Guillermo,

In some of my structural relaxations I wanted to keep a lower EDIFF value, so I changed relax.py to use the EDIFF provided by the user in the INCAR. If it is not provided, the default value of EDIFF=1E-04 is used.

I simply changed:

        # How to change EDIFF
        if vj.input_variables["EDIFF"] > -0.01 * vj.input_variables["EDIFFG"]:
            vj.input_variables["EDIFF"] = round_small(
                -0.01 * vj.input_variables["EDIFFG"]
            )
        else:
            vj.input_variables["EDIFF"] = 1e-4

to:

        # How to change EDIFF
        if vj.input_variables["EDIFF"] > -0.01 * vj.input_variables["EDIFFG"]:
            vj.input_variables["EDIFF"] = round_small(
                -0.01 * vj.input_variables["EDIFFG"]
            )
        else:
            if self.extra_vars["EDIFF"]:
                vj.input_variables["EDIFF"] = self.extra_vars["EDIFF"]
            else:
                vj.input_variables["EDIFF"] = 1e-4

If you think it's something that is useful please merge it and add it to pip.

Thank you,

Best, Uthpala

opened by uthpalaherath 7
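The branch logic of the snippet above can be exercised in isolation on plain numbers. The sketch below is a stand-alone restatement, not the code in relax.py: the function name choose_ediff is hypothetical, the round_small call is left out, and it uses dict.get as an assumption so that a missing EDIFF key falls back to the default instead of raising KeyError.

```python
def choose_ediff(current_ediff: float, ediffg: float, extra_vars: dict,
                 default: float = 1e-4) -> float:
    """Pick an EDIFF value following the branch structure described above."""
    limit = -0.01 * ediffg
    if current_ediff > limit:
        # The original snippet additionally passes this value through round_small.
        return limit
    # Assumption: .get() is used so an absent key does not raise KeyError.
    user_value = extra_vars.get("EDIFF")
    return user_value if user_value else default


# EDIFFG = -0.01 gives a limit of about 1e-4.
print(choose_ediff(1e-3, -0.01, {}))               # current EDIFF too loose: use the limit
print(choose_ediff(1e-6, -0.01, {"EDIFF": 1e-6}))  # user-provided EDIFF is kept
print(choose_ediff(1e-6, -0.01, {}))               # nothing provided: default
```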

new branch

Hi, I am haidi. I feel the code framework of PyChemia is the best one among all the CSP software I know, but more needs to be done to make it user-friendly and highly efficient. I have added some code to it. Would you please open a devel branch for development?

opened by haidi-ustc 2

Releases(2021.09.15)

2021.09.15(Sep 15, 2021)

Pre-release version
Source code(tar.gz)
Source code(zip)
pychemia-0.21.9.15.tar.gz(33.77 MB)
pychemia-0.18.2.20(Feb 20, 2018)

Source code(tar.gz)
Source code(zip)
pychemia-0.18.2.20-py3-none-any.whl(457.51 KB)
pychemia-0.18.2.20.tar.gz(283.76 KB)

Owner

Materials Discovery Group

Materials Discovery

GitHub http://materialsdiscovery.github.io/PyChemia
