Finding project directories in Python (data science) projects, just like the R rprojroot and here packages

Overview

Find relative paths from a project root directory

Finding project directories in Python (data science) projects, just like the R here and rprojroot packages.

Problem: I have a project that has a specific folder structure, for example, one mentioned in Noble 2009 or something similar to this project template, and I want to be able to:

  1. Run my Python scripts without having to specify a series of ../ to get to the data folder.
  2. cd into the directory of my Python script and run it there, instead of calling it from the project root directory and spelling out every folder on the way to the script.
  3. Reference datasets from the project root when using a Jupyter notebook, because every time I use a Jupyter notebook, the working directory changes to the location of the notebook, not where I launched the notebook server.

Solution: pyprojroot finds the root working directory for your project as a pathlib object. You can now use the here function to pass in a relative path from the project root directory (no matter which working directory inside the project you are in), and you will get a full path to the specified file. That is, in a Jupyter notebook, you can write something like pandas.read_csv(here('./data/my_data.csv')) instead of pandas.read_csv('../data/my_data.csv'). This allows you to move scripts and notebooks around within your project without having to worry about changing file paths.

Great for reading and writing datasets!
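To make the idea concrete, here is a minimal sketch of how a project root can be located by walking up from a starting directory until a marker file is found. It is not pyprojroot's actual implementation: find_root and here_sketch are hypothetical names, and the marker list is borrowed from the test parameters quoted in the issue reports further down this page, so the library's real defaults may differ.

```python
from pathlib import Path
import tempfile

# Marker names borrowed from the test parameters quoted on this page;
# the library's real defaults may differ (assumption).
MARKERS = (".git", ".here", "*.Rproj", "requirements.txt", "setup.py", ".dvc")


def find_root(start, markers=MARKERS):
    """Return the nearest ancestor of `start` (inclusive) that holds a marker."""
    start = Path(start).resolve()
    # `parents` is a finite sequence ending at the filesystem root,
    # so this loop always terminates.
    for candidate in (start, *start.parents):
        # glob() handles exact names as well as patterns such as *.Rproj
        if any(any(candidate.glob(pattern)) for pattern in markers):
            return candidate
    raise FileNotFoundError("no project root found above {}".format(start))


def here_sketch(relative_path=".", start=None):
    """Hypothetical stand-in for here(): join a path onto the detected root."""
    root = find_root(start if start is not None else Path.cwd())
    return root / relative_path


# Demo in a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    (project / "notebooks").mkdir(parents=True)
    (project / ".here").touch()  # marks the project root

    demo_root = find_root(project / "notebooks")
    demo_path = here_sketch("data/my_data.csv", start=project / "notebooks")
    demo_ok = demo_root == project and demo_path == project / "data" / "my_data.csv"
```

Iterating over the parents sequence rather than recursing on p.parent means the search stops cleanly at the filesystem root, where a path is its own parent.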

Installation

pip

pip install pyprojroot

conda

https://anaconda.org/conda-forge/pyprojroot

conda install -c conda-forge pyprojroot 

Usage

from pyprojroot import here

here()

Example

Load the packages

In [1]: from pyprojroot import here
In [2]: import pandas as pd

The current working directory is the "notebooks" folder

In [3]: !pwd
/home/dchen/git/hub/scipy-2019-pandas/notebooks

In the notebooks folder, I have all my notebooks

In [4]: !ls
01-intro.ipynb  02-tidy.ipynb  03-apply.ipynb  04-plots.ipynb  05-model.ipynb  Untitled.ipynb

If I wanted to access data in my notebooks I'd have to use ../data

In [5]: !ls ../data
billboard.csv  country_timeseries.csv  gapminder.tsv  pew.csv  table1.csv  table2.csv  table3.csv  table4a.csv  table4b.csv  weather.csv

However, with the here function, I can access my data from the project root. This means if I move the notebook to another folder or subfolder, I don't have to change the path to my data. Only if I move the data to another folder would I need to change the path in my notebook (or script).

In [6]: pd.read_csv(here('./data/gapminder.tsv'), sep='\t').head()
Out[6]:
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106

By the way, you get a pathlib path object back!

In [7]: here('./data/gapminder.tsv')
Out[7]: PosixPath('/home/dchen/git/hub/scipy-2019-pandas/data/gapminder.tsv')
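Because the result is a pathlib path rather than a string, the usual pathlib attributes and the / operator are available on it. The sketch below rebuilds the path shown in Out[7] as a PurePosixPath, so it runs without pyprojroot installed and without touching the file system; the variable names are illustrative only.

```python
from pathlib import PurePosixPath

# The path printed in Out[7], rebuilt as a pure path (no file access needed)
data_file = PurePosixPath("/home/dchen/git/hub/scipy-2019-pandas/data/gapminder.tsv")

file_name = data_file.name               # final path component
extension = data_file.suffix             # file extension, including the dot
data_dir = data_file.parent              # directory that holds the file
sibling = data_dir / "pew.csv"           # join paths with the / operator
as_csv = data_file.with_suffix(".csv")   # same path with a different extension
```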
Comments
  • Refactor to align with rprojroot and here

    Major refactoring to allow the library to provide the same kind of functionality as both the rprojroot and here R libraries.

    There's still lots of functionality in rprojroot and here that isn't implemented in this PR, but it's a start.

    Closes #18, Closes #17

    opened by jamesmyatt 9
  • Continuous Integration

    This project should have continuous integration (especially since the tests in v0.2.0 are broken), but there are lots of options, including:

    • Github actions
    • Azure pipelines
    • CircleCI
    • Travis CI
    • ...

    I can help set this up. Which one do you prefer?

    opened by jamesmyatt 4
  • Error on Import

    Hi, I get the following error on import. What can I do?

    flo@comp:/projects/bla$ sudo pip3 install pyprojroot
    Requirement already satisfied: pyprojroot in /usr/local/lib/python3.5/dist-packages (0.1.1)
    flo@comp:/projects/bla$ python3
    Python 3.5.3 (default, Sep 27 2018, 17:25:39)
    [GCC 6.3.0 20170516] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from pyprojroot import here
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.5/dist-packages/pyprojroot/__init__.py", line 1, in <module>
        from .pyprojroot import *
      File "/usr/local/lib/python3.5/dist-packages/pyprojroot/pyprojroot.py", line 22
        warnings.warn(f"Path doesn't exist: {pth}")
                                                 ^
    SyntaxError: invalid syntax
    
    opened by r0f1 4
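The SyntaxError in the "Error on Import" report above points at an f-string, a syntax that was added in Python 3.6, while the session shown runs Python 3.5.3. As a hedged illustration (not a patch from the project), the same message can be built with str.format, which also works on older interpreters; the pth value below is a made-up example.

```python
import sys
import warnings

pth = "data/missing.csv"  # hypothetical path, for illustration only

# str.format builds the same message as f"Path doesn't exist: {pth}"
# and is valid syntax on Python versions that predate f-strings.
message = "Path doesn't exist: {}".format(pth)

# f-strings are only valid syntax from Python 3.6 onwards
supports_fstrings = sys.version_info >= (3, 6)

# Emit the warning and capture it so the result can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(message)
```

Upgrading the interpreter to Python 3.6 or newer is the other way around the error, since the installed package itself contains the f-string.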
  • Add CI and make `mypy --strict` compliant

    Add CI through GitHub Actions.

    This includes adding rules for test and lint to the Makefile and calling them in the added workflow. fmt has also been added to apply black formatting to the package and tests.

    Additionally, the linting checks against mypy --strict, black, and flake8.

    opened by eganjs 3
  • Update test with explicit naming, change import (ordering) and minor code refactor.

    Hello Daniël,

    To fix the failing tests, which were caused by the Rproject filename lacking a wildcard character, the pyprojroot.py file was updated. Besides that, Path is now imported explicitly from pathlib, and the functions have been refactored to use it. This should fix #8.

    The test code has been slightly refactored to have more explicit comments and variables. Pathlib is used for path comparisons and the os module is only used for the chdir function, reducing the need to cast paths to strings. Last but not least, I rearranged the imports to keep things pretty.

    Project is pushed from gitlab, so you'll see a different user in commits. 🤷‍♂ 🍰

    opened by rj-wilson 1
  • black, code style, fix tests

    Turns out the issue is with the .Rproj files.

    The only change I had to make was to look for the glob pattern *.Rproj rather than only for a file named .Rproj.

    Closes #8.

    $ pytest -v
    ============================================================================================ test session starts =============================================================================================
    platform darwin -- Python 3.7.3, pytest-5.1.2, py-1.8.0, pluggy-0.13.0 -- $HOME/anaconda/envs/pyprojroot-dev/bin/python
    cachedir: .pytest_cache
    rootdir: $HOME/github/software/pyprojroot
    collected 25 items
    
    tests/test_pyprojroot.py::test_version PASSED                                                                                                                                                          [  4%]
    tests/test_pyprojroot.py::test_here[stuff-.git] PASSED                                                                                                                                                 [  8%]
    tests/test_pyprojroot.py::test_here[stuff-.here] PASSED                                                                                                                                                [ 12%]
    tests/test_pyprojroot.py::test_here[stuff-my_project.Rproj] PASSED                                                                                                                                     [ 16%]
    tests/test_pyprojroot.py::test_here[stuff-requirements.txt] PASSED                                                                                                                                     [ 20%]
    tests/test_pyprojroot.py::test_here[stuff-setup.py] PASSED                                                                                                                                             [ 24%]
    tests/test_pyprojroot.py::test_here[stuff-.dvc] PASSED                                                                                                                                                 [ 28%]
    tests/test_pyprojroot.py::test_here[src-.git] PASSED                                                                                                                                                   [ 32%]
    tests/test_pyprojroot.py::test_here[src-.here] PASSED                                                                                                                                                  [ 36%]
    tests/test_pyprojroot.py::test_here[src-my_project.Rproj] PASSED                                                                                                                                       [ 40%]
    tests/test_pyprojroot.py::test_here[src-requirements.txt] PASSED                                                                                                                                       [ 44%]
    tests/test_pyprojroot.py::test_here[src-setup.py] PASSED                                                                                                                                               [ 48%]
    tests/test_pyprojroot.py::test_here[src-.dvc] PASSED                                                                                                                                                   [ 52%]
    tests/test_pyprojroot.py::test_here[data-.git] PASSED                                                                                                                                                  [ 56%]
    tests/test_pyprojroot.py::test_here[data-.here] PASSED                                                                                                                                                 [ 60%]
    tests/test_pyprojroot.py::test_here[data-my_project.Rproj] PASSED                                                                                                                                      [ 64%]
    tests/test_pyprojroot.py::test_here[data-requirements.txt] PASSED                                                                                                                                      [ 68%]
    tests/test_pyprojroot.py::test_here[data-setup.py] PASSED                                                                                                                                              [ 72%]
    tests/test_pyprojroot.py::test_here[data-.dvc] PASSED                                                                                                                                                  [ 76%]
    tests/test_pyprojroot.py::test_here[data/hello-.git] PASSED                                                                                                                                            [ 80%]
    tests/test_pyprojroot.py::test_here[data/hello-.here] PASSED                                                                                                                                           [ 84%]
    tests/test_pyprojroot.py::test_here[data/hello-my_project.Rproj] PASSED                                                                                                                                [ 88%]
    tests/test_pyprojroot.py::test_here[data/hello-requirements.txt] PASSED                                                                                                                                [ 92%]
    tests/test_pyprojroot.py::test_here[data/hello-setup.py] PASSED                                                                                                                                        [ 96%]
    tests/test_pyprojroot.py::test_here[data/hello-.dvc] PASSED                                                                                                                                            [100%]
    
    ============================================================================================= 25 passed in 0.21s =============================================================================================
    
    opened by ericmjl 1
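The difference between the two checks described in the "black, code style, fix tests" item above can be sketched as follows. This is an illustration using a temporary directory and the my_project.Rproj file name from the test output, not the project's actual code.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "my_project.Rproj").touch()

    # Exact-name check: looks for a file literally called ".Rproj"
    exact_match = (root / ".Rproj").exists()

    # Glob check: matches any file whose name ends in ".Rproj"
    glob_matches = sorted(p.name for p in root.glob("*.Rproj"))
```

Here exact_match is False while glob_matches contains my_project.Rproj, which is why a directory with a named .Rproj file is only recognised once the wildcard is used.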
  • Fix tests

    #3 added tests to the repository, but after merging a few PRs the tests are broken. Even if I git reset --hard d381ef9, I get the following errors:

    $ pytest -k test_here
    ============================= test session starts ==============================
    platform linux -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
    rootdir: /home/dchen/git/hub/pyprojroot
    plugins: doctestplus-0.3.0, arraydiff-0.3, openfiles-0.3.2, remotedata-0.3.1
    collected 25 items / 1 deselected / 24 selected                                
    
    tests/test_pyprojroot.py .....F.....F.....F.....F                        [100%]
    
    =================================== FAILURES ===================================
    ____________________________ test_here[stuff-.dvc] _____________________________
    
    self = PosixPath('/.git')
    
        def __str__(self):
            """Return the string representation of the path, suitable for
            passing to system calls."""
            try:
    >           return self._str
    E           AttributeError: _str
    
    /home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
    
    During handling of the above exception, another exception occurred:
    
    tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_stuff__dvc_0')
    proj_file = '.dvc', child_dir = 'stuff'
    
        @pytest.mark.parametrize(
            "proj_file",
            [
                ".git",
                ".here",
                "my_project.Rproj",
                "requirements.txt",
                "setup.py",
                ".dvc",
            ],
        )
        @pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
        def test_here(tmpdir, proj_file, child_dir):
            """
            This test uses pytest's tmpdir facilities to create a simulated project
            directory, and checks that the path is correct.
            """
            # Make proj_file
            tmpdir = Path(tmpdir)
            p = tmpdir / proj_file
            with p.open("w") as fpath:
                fpath.write("blah")
        
            # Make child dirs
            (tmpdir / child_dir).mkdir(parents=True)
            os.chdir(tmpdir / child_dir)
            assert os.getcwd() == str(tmpdir / child_dir)
        
            # Check that proj
    >       path = here()
    
    /home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
        proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    E   RecursionError: maximum recursion depth exceeded while calling a Python object
    !!! Recursion detected (same locals & position)
    _____________________________ test_here[src-.dvc] ______________________________
    
    self = PosixPath('/.git')
    
        def __str__(self):
            """Return the string representation of the path, suitable for
            passing to system calls."""
            try:
    >           return self._str
    E           AttributeError: _str
    
    /home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
    
    During handling of the above exception, another exception occurred:
    
    tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_src__dvc_0')
    proj_file = '.dvc', child_dir = 'src'
    
        @pytest.mark.parametrize(
            "proj_file",
            [
                ".git",
                ".here",
                "my_project.Rproj",
                "requirements.txt",
                "setup.py",
                ".dvc",
            ],
        )
        @pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
        def test_here(tmpdir, proj_file, child_dir):
            """
            This test uses pytest's tmpdir facilities to create a simulated project
            directory, and checks that the path is correct.
            """
            # Make proj_file
            tmpdir = Path(tmpdir)
            p = tmpdir / proj_file
            with p.open("w") as fpath:
                fpath.write("blah")
        
            # Make child dirs
            (tmpdir / child_dir).mkdir(parents=True)
            os.chdir(tmpdir / child_dir)
            assert os.getcwd() == str(tmpdir / child_dir)
        
            # Check that proj
    >       path = here()
    
    /home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
        proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    E   RecursionError: maximum recursion depth exceeded while calling a Python object
    !!! Recursion detected (same locals & position)
    _____________________________ test_here[data-.dvc] _____________________________
    
    self = PosixPath('/.git')
    
        def __str__(self):
            """Return the string representation of the path, suitable for
            passing to system calls."""
            try:
    >           return self._str
    E           AttributeError: _str
    
    /home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
    
    During handling of the above exception, another exception occurred:
    
    tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_data__dvc_0')
    proj_file = '.dvc', child_dir = 'data'
    
        @pytest.mark.parametrize(
            "proj_file",
            [
                ".git",
                ".here",
                "my_project.Rproj",
                "requirements.txt",
                "setup.py",
                ".dvc",
            ],
        )
        @pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
        def test_here(tmpdir, proj_file, child_dir):
            """
            This test uses pytest's tmpdir facilities to create a simulated project
            directory, and checks that the path is correct.
            """
            # Make proj_file
            tmpdir = Path(tmpdir)
            p = tmpdir / proj_file
            with p.open("w") as fpath:
                fpath.write("blah")
        
            # Make child dirs
            (tmpdir / child_dir).mkdir(parents=True)
            os.chdir(tmpdir / child_dir)
            assert os.getcwd() == str(tmpdir / child_dir)
        
            # Check that proj
    >       path = here()
    
    /home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
        proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    E   RecursionError: maximum recursion depth exceeded while calling a Python object
    !!! Recursion detected (same locals & position)
    __________________________ test_here[data/hello-.dvc] __________________________
    
    self = PosixPath('/.git')
    
        def __str__(self):
            """Return the string representation of the path, suitable for
            passing to system calls."""
            try:
    >           return self._str
    E           AttributeError: _str
    
    /home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
    
    During handling of the above exception, another exception occurred:
    
    tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_data_hello__dvc_0')
    proj_file = '.dvc', child_dir = 'data/hello'
    
        @pytest.mark.parametrize(
            "proj_file",
            [
                ".git",
                ".here",
                "my_project.Rproj",
                "requirements.txt",
                "setup.py",
                ".dvc",
            ],
        )
        @pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
        def test_here(tmpdir, proj_file, child_dir):
            """
            This test uses pytest's tmpdir facilities to create a simulated project
            directory, and checks that the path is correct.
            """
            # Make proj_file
            tmpdir = Path(tmpdir)
            p = tmpdir / proj_file
            with p.open("w") as fpath:
                fpath.write("blah")
        
            # Make child dirs
            (tmpdir / child_dir).mkdir(parents=True)
            os.chdir(tmpdir / child_dir)
            assert os.getcwd() == str(tmpdir / child_dir)
        
            # Check that proj
    >       path = here()
    
    /home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
        proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    /home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
        return pyprojroot(p.parent, proj_files)
    E   RecursionError: maximum recursion depth exceeded while calling a Python object
    !!! Recursion detected (same locals & position)
    ============== 4 failed, 20 passed, 1 deselected in 0.70 seconds ===============
    
    
    opened by chendaniely 1
  • Add Visual Studio Code config directory

    This just adds the Visual Studio Code config directory as a possible indicator for the project root. As VSCode is also a very popular Python code editor this might make sense.

    opened by stlehmann 1
  • Add tests and more root files

    Dude, this is a great package. I'm already using it in my own projects at work! Keeps things so much more sane.

    I took the liberty of adding some tests. In particular, we test here that the root directories are found correctly.

    Through modding the source code, I learned that I can use a .here file as well. That's amazeballs!

    Also added .spyproject. Hope that helps. This would close #2.

    Finally, some other utilities to help conda-based developers get started are added too.

    In case you need confirmation on whether this works, at least it does on my computer:

    $ pytest -k test_here
    ======================================================== test session starts =========================================================
    platform darwin -- Python 3.6.7, pytest-4.4.1, py-1.7.0, pluggy-0.9.0
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/Users/ericmjl/github/software/pyprojroot/.hypothesis/examples')
    rootdir: /Users/ericmjl/github/software/pyprojroot
    plugins: remotedata-0.3.1, openfiles-0.3.1, doctestplus-0.2.0, arraydiff-0.2, hypothesis-4.17.2
    collected 25 items / 1 deselected / 24 selected                                                                                      
    
    tests/test_pyprojroot.py ........................                                                                              [100%]
    
    ============================================== 24 passed, 1 deselected in 0.22 seconds ===============================================
    
    opened by ericmjl 1
  • I think this is all i need for #22

    Should fix #22. I mainly followed the instructions listed in that issue, which pointed to this page: https://hynek.me/articles/testing-packaging/#src

    To achieve that, you just move your packages into a src directory and add a where argument to find_packages() in your setup.py:

    setup(
        [...]
        packages=find_packages(where="src"),
        package_dir={"": "src"},
    )
    
    opened by chendaniely 0
  • Python 2 compatibility?

    I know python 2 is EOL but there are a lot of geospatial analysts clinging to it - python 2.x is still shipping with ArcGIS Desktop and people are reluctant to change, and this module could simplify a lot of hacky path issues.

    With that in mind, would you be interested in a PR to add python 2 backwards compatibility for pyprojroot? I'm happy to write and submit the changes, but if it's not something you want to add to the module, I understand.

    opened by joshpsawyer 0
  • #27 - mypy fixup, adds to #24 - unblocks #31

    This PR proposes a fix for the fix_mypy_strict branch.

    The 3 commits contain:

    • Change of the github action to use a "matrix build", simply taken from the examples and adapted to the project
    • Complete the type annotations in the source code (fixes #27)
    • Add __all__ to the __init__.py to define the public interface of this package.

    The next steps would be to merge the result into master and release it (#31).

    opened by achimgaedke 2
  • New Release?

    Hi there!

    I wonder whether it is possible to cut a new release after the major refactor last year.

    PyPI shows 0.2.0 from 2019 - https://pypi.org/project/pyprojroot/#history

    What can I do to help with this?

    opened by achimgaedke 5
  • Improve README

    I found an ITP on the debian-python list about your project. Sorry, I don't understand all details.

    Can you improve your README and explain in a bit more detail what the package does and what its advantage is? Currently I don't see an advantage.

    I only see very simple path handling, which could also be done by or via pathlib itself. What does your package add to the pathlib functionality?

    opened by buhtz 3
  • Make project-root files toggle-able (e.g. DVC)

    Currently the tool will find any path containing a .dvc folder; however, there are times when it is necessary to initialize dvc in a subdirectory. This comes up when e.g. packaging the mydata.csv.dvc files via, say, importlib to let users treat datasets as dependencies in the code itself.

    On that note, dvc actually does this for you, with dvc.api.Repo.find_root(). It may be worth modularizing the assumptions here a bit, to let users assemble the "right" assumptions for what a "root directory" is, while providing sane defaults to the rest.

    opened by tbsexton 1
  • Fix mypy strict

    I did raise the original request for the mypy typing so I feel a bit responsible for the errors in the refactor, hopefully this helps with the dev branch!

    Don't feel like this PR actually needs to be merged. I am happy if it's just used for guidance, if that's even needed :)

    opened by eganjs 0
  • Re-implement `mypy strict` checking

    Please see #26

    @eganjs implemented mypy --strict checking in #21, but after the refactor from #20, the mypy checks are failing. I'm pretty new to using mypy, so I'm going to need help with setting up type hints...

    opened by chendaniely 0
Owner
Daniel Chen
bow ties are cool
Daniel Chen
A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful.

How useful is the aswer? A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful. If you want to l

null 1 Dec 17, 2021
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.

Tuplex 791 Jan 4, 2023
Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials

Data Scientist Learning Plan Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials

Trung-Duy Nguyen 27 Nov 1, 2022
A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).

This tutorial's purpose is to introduce Pythonistas to methods for scaling their data science and machine learning work to larger datasets and larger models, using the tools and APIs they know and love from the PyData stack (such as numpy, pandas, and scikit-learn).

Coiled 102 Nov 10, 2022
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python

Thomas 2 May 26, 2022
Lale is a Python library for semi-automated data science.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion.

International Business Machines 293 Dec 29, 2022
Using Data Science with Machine Learning techniques (ETL pipeline and ML pipeline) to classify received messages after disasters.

null 1 Feb 11, 2022
Orchest is a browser based IDE for Data Science.

Orchest is a browser based IDE for Data Science. It integrates your favorite Data Science tools out of the box, so you don’t have to. The application is easy to use and can run on your laptop as well as on a large scale cloud cluster.

Orchest 3.6k Jan 9, 2023
Data Science Environment Setup in single line

datascienv is a package that helps you set up your environment in a single line of code with all dependencies. It also includes pyforest, which provides a single-line import of all required ML libraries.

Ashish Patel 55 Dec 16, 2022
Improving your data science workflows with

Make Better Defaults Author: Kjell Wooding This is the git repo for Makefiles: One great trick for making your conda environments mo

Kjell Wooding 18 Dec 23, 2022
Open source platform for Data Science Management automation

Hydrosphere examples This repo contains demo scenarios and pre-trained models to show Hydrosphere capabilities. Data and artifacts management Some mod

hydrosphere.io 6 Aug 10, 2021
2019 Data Science Bowl

Kaggle-2019-Data-Science-Bowl-Solution - Here I present my solution to the Kaggle 2019 Data Science Bowl and how I improved it to win a silver medal in that competition.

Deepak Nandwani 1 Jan 1, 2022
Statistical Analysis 📈 focused on statistical analysis and exploration used on various data sets for personal and professional projects.

Statistical Analysis 📈 This repository focuses on statistical analysis and the exploration used on various data sets for personal and professional pr

Andy Pham 1 Sep 3, 2022
Projects that implement various aspects of Data Engineering.

DATAWAREHOUSE ON AWS The purpose of this project is to build a data warehouse to accommodate data of active user activity for music streaming applicatio

null 2 Oct 14, 2021
BErt-like Neurophysiological Data Representation

BENDR BErt-like Neurophysiological Data Representation This repository contains the source code for reproducing, or extending the BERT-like self-super

null 114 Dec 23, 2022
A Big Data ETL project in PySpark on the historical NYC Taxi Rides data

Processing NYC Taxi Data using PySpark ETL pipeline Description This is a project to extract, transform, and load a large amount of data from NYC Taxi

Unnikrishnan 2 Dec 12, 2021
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Amundsen 3.7k Jan 3, 2023
Hidden Markov Models in Python, with scikit-learn like API

hmmlearn hmmlearn is a set of algorithms for unsupervised learning and inference of Hidden Markov Models. For supervised learning of HMMs and

null 2.7k Jan 3, 2023
Elementary is an open-source data reliability framework for modern data teams. The first module of the framework is data lineage.

Data lineage made simple, reliable, and automated. Effortlessly track the flow of data, understand dependencies and analyze impact. Features Visualiza

null 898 Jan 9, 2023