Pearpy - a Python package for writing multithreaded code and parallelizing tasks across CPU threads.

Overview

Build PyPI version

Pearpy

The Python package for (pear)allelizing your tasks across multiple CPU threads.

Installation

The latest version of Pearpy can be installed with:

pip install pearpy

To stay up to date with Pearpy's releases, visit the official page on PyPi or take a look at our GitHub Releases!

Usage

  1. Create a Pear() object. This will be a wrapper for all of your multithreaded processes.
  2. Identify the functions on which you would like to paralleilze computation.
  3. Add your tasks to the Pear. If a potential race condition is detected, target processes will be automatically locked.
  4. Run the paraellelized processes simultaneously.

Example

from pearpy.pear import Pear

# First function to be parallelized
def t1(num1, num2):
    print('t1: ', num1 + num2)

# Second function to be paralellized
def t2(num):
    print('t2: ', num)

# Create pear object, add threads, and run
pear = Pear()
pear.add_thread(t1, [4, 5])
pear.add_thread(t2, 4)
pear.run()

Race Condition Handling

When multiple threads utilize the same function, Pear will automatically generate locks for each resource. This allows developers to utilize Pear's multithreading without having to worry about inaccurate data caused by race conditions. The following example shows how race conditions are handled:

from pearpy.pear import Pear

global_var = 10

# This function reads from and writes to a global variable
def t_duplicated(num):
    global global_var
    print('t_duplicated: ', num + global_var)
    global_var += 1

# Pear object created with two threads accessing a shared resource
# A race condition is detected and locks are generated
pear = Pear()
pear.add_thread(t_duplicated, 1) # This should return 11 because 1 + 10 = 11
pear.add_thread(t_duplicated, 1) # This should return 12 because global_var is incremented
pear.run()

##########
# OUTPUT #
##########
t_duplicated: 11
t_duplicated: 12

Benchmarks and Tests

Benchmarks can be examined via the make benchmark command. This will display the threaded vs unthreaded runtimes on a set script, along with the percent improvement between the two. Here is an example of what the benchmarks should look like:

----------------------------------------------------------------------
THREADED BENCHMARK
3.8507602214813232 s
----------------------------------------------------------------------
UNTHREADED BENCHMARK
13.90523624420166 s
----------------------------------------------------------------------
Improvement:  361.1036638072611 %
.
----------------------------------------------------------------------
Ran 1 test in 17.757s

OK

To run tests, utilize the make test command. This will output the results of the functions called in the /tests/test_pear.py script, along with the status of the tests themselves. The console will display 'OK' if the tests are passing.

Contributing

Pear is open source and contributions from anyone are welcome. To contribute to this project, please submit issues and pull requests via GitHub. In order to successfully merge a pull request, all unit tests must be passed when run via make test. Thank you!

You might also like...
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

Here I plotted data for the average test scores across schools and class sizes across school districts.
Here I plotted data for the average test scores across schools and class sizes across school districts.

HW_02 Here I plotted data for the average test scores across schools and class sizes across school districts. Average Test Score by Race This graph re

A multithreaded view bot for YouTube
A multithreaded view bot for YouTube

Simple program to increase YouTube views written in Python.

A Proof-of-Concept Layer 2 Denial of Service Attack that disrupts low level operations of Programmable Logic Controllers within industrial environments. Utilizing multithreaded processing, Automator-Terminator delivers a powerful wave of spoofed ethernet packets to a null MAC address.
The producer-consumer problem implemented with threads in Python

This was developed using a Python virtual environment, I would strongly recommend to do the same if you want to clone this repository. How to run this

A python tool for synchronizing the messages from different threads, processes, or hosts.

Sync-stream This project is designed for providing the synchoronization of the stdout / stderr among different threads, processes, devices or hosts.

Python directory buster, multiple threads, gobuster-like CLI, web server brute-forcer, URL replace pattern feature.

pybuster v1.1 pybuster is a tool that is used to brute-force URLs of web servers. Features Directory busting (URI) URL replace patterns (put PYBUSTER

A bot framework for Reddit to manage threads, wiki pages, widgets, menus and more.

Sub Manager Sub Manager is a bot framework for Reddit to automate a variety of tasks on one or more subreddits, and can be configured and run without

City-seeds - A random generator of cultural characteristics intended to spark ideas and help draw threads

City Seeds This is a random generator of cultural characteristics intended to sp

UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)

Visualise Ansible execution time across playbooks, tasks, and hosts.
Visualise Ansible execution time across playbooks, tasks, and hosts.

ansible-trace Visualise where time is spent in your Ansible playbooks: what tasks, and what hosts, so you can find where to optimise and decrease play

ShadowClone allows you to distribute your long running tasks dynamically across thousands of serverless functions and gives you the results within seconds where it would have taken hours to complete

ShadowClone allows you to distribute your long running tasks dynamically across thousands of serverless functions and gives you the results within seconds where it would have taken hours to complete

Download images from forum threads

Forum Image Scraper Downloads images from forum threads Only works with forums which doesn't require a login to view and have an incremental paginatio

A mass creator for Discord's new channel threads.

discord-thread-flooder A mass creator for Discord's new channel threads. (obv created by https://github.com/imvast) Warning: this may lag ur pc if u h

Automatically re-open threads when they get archived, no matter your boost level!

ThreadPersist Automatically re-open threads when they get archived, no matter your boost level! Installation You will need to install poetry to run th

Keepalive - Discord Bot to keep threads from expiring

keepalive Discord Bot to keep threads from expiring Installation Create a new Di

This speeds up PyCharm's package index processes and avoids CPU & memory overloading

This speeds up PyCharm's package index processes and avoids CPU & memory overloading

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

A python package to avoid writing and maintaining duplicated python docstrings.

docstring-inheritance is a python package to avoid writing and maintaining duplicated python docstrings.

Comments
  • Race Condition Handling

    Race Condition Handling

    Created a dictionary where the key is the name of the group of functions being executed to prevent busy waiting. For this I needed to change the name of the thread to the function name.

    opened by Mark-Nawar 0
  • Shared resources not handled properly

    Shared resources not handled properly

    To reproduce:

    • Create a function that manipulates and reads from a global variable
    • Create a pear object and add 2 threads that utilizes the function
    • Run the pear -> notice that race conditions are not handled properly
    opened by aidenszeto 0
Releases(v0.1.3)
Owner
MLH Fellowship
An internship alternative for software engineers, by Major League Hacking
MLH Fellowship
Pyccel stands for Python extension language using accelerators.

Pyccel stands for Python extension language using accelerators.

Pyccel 242 Jan 2, 2023
Sampling profiler for Python programs

py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe

Ben Frederickson 9.5k Jan 1, 2023
Django query profiler - one profiler to rule them all. Shows queries, detects N+1 and gives recommendations on how to resolve them

Django Query Profiler This is a query profiler for Django applications, for helping developers answer the question "My Django code/page/API is slow, H

Django Query Profiler 116 Dec 15, 2022
guapow is an on-demand and auto performance optimizer for Linux applications.

guapow is an on-demand and auto performance optimizer for Linux applications. This project's name is an abbreviation for Guarana powder (Guaraná is a fruit from the Amazon rainforest with a highly caffeinated seed).

Vinícius Moreira 19 Nov 18, 2022
Django package to log request values such as device, IP address, user CPU time, system CPU time, No of queries, SQL time, no of cache calls, missing, setting data cache calls for a particular URL with a basic UI.

django-web-profiler's documentation: Introduction: django-web-profiler is a django profiling tool which logs, stores debug toolbar statistics and also

MicroPyramid 77 Oct 29, 2022
A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

??️ CygnusX1 Code by Trong-Dat Ngo. Overviews ??️ CygnusX1 is a multithreaded tool ??️ , used to search and download images from popular search engine

DatNgo 32 Dec 31, 2022
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

RGF-team 364 Dec 28, 2022
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

RGF-team 363 Dec 14, 2022
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 6.9k Jan 4, 2023
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 5.7k Feb 12, 2021