An Aspiring Drop-In Replacement for NumPy at Scale

Overview

Legate NumPy

Legate NumPy is a Legate library that aims to provide a distributed and accelerated drop-in replacement for the NumPy API on top of the Legion runtime. Using Legate NumPy you can do things like run the final example of the Python CFD course completely unmodified on 2048 A100 GPUs in a DGX SuperPOD and achieve good weak scaling.

Legate NumPy works best for programs that have very large arrays of data that cannot fit in the memory of a single GPU or a single node and need to span multiple nodes and GPUs. While our implementation of the current NumPy API is still incomplete, programs that use unimplemented features will still work (assuming enough memory) by falling back to the canonical NumPy implementation.

  1. Dependencies
  2. Usage and Execution
  3. Supported and Planned Features
  4. Supported Types and Dimensions
  5. Documentation
  6. Future Directions
  7. Known Bugs

Dependencies

Users must have a working installation of the Legate Core library prior to installing Legate NumPy.

Legate NumPy requires Python >= 3.6. We provide a conda environment file that installs all needed dependencies in one step. Use the following command to create a conda environment with it:

conda env create -n legate -f conda/legate_numpy_dev.yml

Installation

Installation of Legate NumPy is done with either setup.py for simple use cases or install.py for more advanced use cases. The most common installation command is:

python setup.py --with-core 

This will build Legate NumPy against the Legate Core installation and then install Legate NumPy into the same location. Users can also install Legate NumPy into an alternative location with the canonical --prefix flag:

python setup.py --prefix  --with-core 

Note that after the first invocation of setup.py this repository will remember which Legate Core installation to use and the --with-core option can be omitted unless the user wants to change it.

Advanced users can also invoke install.py --help to see options for configuring Legate NumPy by invoking the install.py script directly.

Of particular interest to Legate NumPy users will likely be the option for specifying an installation of OpenBLAS to use. If you already have an installation of OpenBLAS on your machine you can inform the install.py script about its location using the --with-openblas flag:

python setup.py --with-openblas /path/to/open/blas/

Usage and Execution

Using Legate NumPy as a replacement for NumPy is easy. Users only need to replace:

import numpy as np

with:

import legate.numpy as np

These programs can then be run by the Legate driver script described in the Legate Core documentation.

legate legate_numpy_program.py

For execution with multiple nodes (assuming Legate Core is installed with GASNet support) users can supply the --nodes flag. For execution with GPUs, users can use the --gpus flag to specify the number of GPUs to use per node. We encourage all users to familiarize themselves with these resource flags as described in the Legate Core documentation or simply by passing --help to the legate driver script.
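
As a rough illustration of the import swap, the sketch below is written so that it runs under either library. The try/except fallback, the file name, and the mean_of_squares helper are illustrative assumptions, not part of Legate NumPy.

```python
# legate_numpy_program.py (hypothetical example script)
try:
    import legate.numpy as np  # distributed drop-in replacement
except ImportError:
    import numpy as np  # canonical NumPy, so the sketch still runs


def mean_of_squares(n):
    """Return the mean of the squares 0^2, 1^2, ..., (n-1)^2."""
    x = np.arange(n, dtype=np.float64)
    return float(np.sum(x * x) / n)


if __name__ == "__main__":
    print(mean_of_squares(1000))
```

The body of the program only refers to np, so the choice of backend is confined to the import lines.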

Supported and Planned Features

Legate NumPy is currently a work in progress and we are gradually adding support for additional NumPy operators. Unsupported NumPy operations will provide a warning that we are falling back to canonical NumPy. Please report unimplemented features that are necessary for attaining good performance so that we can triage them and prioritize implementation appropriately. The more users that report an unimplemented feature, the more we will prioritize it. Please include a pointer to your code if possible too so we can see how you are using the feature in context.
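
The warn-and-fall-back behavior described above can be pictured with a small sketch. This is not Legate NumPy's actual dispatch code: the ACCELERATED table, the dispatch function, and the choice of RuntimeWarning are all assumptions made for illustration.

```python
import warnings

import numpy as np

# Hypothetical registry of operations that have an accelerated implementation.
# np.add stands in for such an implementation here.
ACCELERATED = {"add": np.add}


def dispatch(name, *args, **kwargs):
    """Call the accelerated op if one is registered, else warn and use NumPy."""
    if name in ACCELERATED:
        return ACCELERATED[name](*args, **kwargs)
    warnings.warn(
        f"{name} is not implemented; falling back to canonical NumPy",
        RuntimeWarning,
        stacklevel=2,
    )
    return getattr(np, name)(*args, **kwargs)
```

Because the fallback goes through the standard warnings module in this sketch, a user hunting for unimplemented features could promote such warnings to errors with warnings.simplefilter("error").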

Supported Types and Dimensions

Legate NumPy currently supports the following NumPy types: float16, float32, float64, int16, int32, int64, uint16, uint32, uint64, bool, complex64, and complex128. Legate NumPy currently works only on arrays of up to three dimensions. We are working on support for N-D arrays. If you need arrays with more than three dimensions, please let us know.
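
A quick way to check whether an existing array falls inside these limits is sketched below using canonical NumPy. The is_supported helper and the two constants are hypothetical names introduced here; they are not part of the Legate NumPy API.

```python
import numpy as np

# Types and dimension limit as listed in this section.
SUPPORTED_DTYPES = {
    "float16", "float32", "float64",
    "int16", "int32", "int64",
    "uint16", "uint32", "uint64",
    "bool", "complex64", "complex128",
}
MAX_DIMS = 3


def is_supported(array):
    """Return True if dtype and dimensionality fall within the listed limits."""
    return array.dtype.name in SUPPORTED_DTYPES and array.ndim <= MAX_DIMS
```

For example, is_supported(np.zeros((4, 4), dtype=np.int8)) is False because int8 is not in the list, and a four-dimensional float64 array is rejected because of its dimensionality.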

Documentation

A complete list of available features is provided in the API reference.

Future Directions

There are three primary directions that we plan to investigate with Legate NumPy going forward:

  • More features: we plan to identify a few key lighthouse applications and use the demands of these applications to drive the addition of new features to Legate NumPy.
  • File I/O: we plan to add support for sharded file I/O for loading and storing large data sets that could never be loaded on a single node. Initially this will begin with native support for h5py but will grow to accommodate other formats needed by our lighthouse applications.
  • Strong scaling: while Legate NumPy is currently implemented in a way that enables weak scaling of codes on larger data sets, we would also like to make it possible to strong-scale Legate applications for a single problem size. This will require leveraging some of the more advanced features of Legion from inside the Python interpreter.
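
To make the weak versus strong scaling distinction concrete, the sketch below uses the usual textbook efficiency definitions. The function names are hypothetical and any timings passed in are placeholders, not measurements of Legate NumPy.

```python
def weak_scaling_efficiency(t_one, t_n):
    """Problem size grows with processor count; the ideal is constant run time."""
    return t_one / t_n


def strong_scaling_efficiency(t_one, t_n, n):
    """Problem size is fixed; the ideal run time on n processors is t_one / n."""
    return t_one / (n * t_n)
```

With placeholder timings, a run that takes 10 s on one GPU and 12.5 s on many GPUs with proportionally more data has a weak scaling efficiency of 0.8, while a fixed-size run that drops from 10 s to 2.5 s on 8 GPUs has a strong scaling efficiency of 0.5.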

We are open to comments, suggestions, and ideas.

Known Bugs

  • Legate NumPy can exercise a bug in OpenBLAS when it is run with multiple OpenMP processors.
  • On Mac OSX, Legate NumPy can trigger a bug in Apple's implementation of libc++. The bug has since been fixed but likely will not show up on most Apple machines for quite some time. You may have to manually patch your implementation of libc++. If you have trouble doing this please contact us and we will be able to help you.
Comments
  • Use conda compilers in dev envs?

    This past week I struggled to get a working GASNet build locally, with either the MPI or UDP conduits. After basically an entire day of frustration, I was finally able to get things working only by using the conda compilers and openmpi packages from conda-forge.

    Should we just add these to dev/test conda env files to use by default?

    I am not certain whether there are any potential downsides to this, so I wanted to open this issue for discussion.

    cc @magnatelee @ipdemes @marcinz @trxcllnt

    opened by bryevdv 19
  • Build OpenBLAS with CROSS option to prevent tests at compile time

    This change would prevent OpenBLAS from running tests during compilation. This is needed when building on machines or in conditions (e.g., using docker) that cause some of the OpenBLAS tests to fail. The potential downside of this is that we would want to run with OpenBLAS checks by default when users are building in the same environment in which they will run. If we want to run OpenBLAS tests by default, we could add an option to prevent testing at build time.
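
One way the opt-out suggested above could be wired into the build is sketched here. The base flags mirror the make invocation that appears in an install log elsewhere on this page; the function name, the run_tests parameter, and the assumption that the option is passed to make as CROSS=1 are illustrative and do not reflect the actual install.py code.

```python
def openblas_make_args(thread_count, run_tests=True):
    """Build the make command line; CROSS=1 is assumed to skip build-time tests."""
    args = [
        "make", "-j", str(thread_count),
        "USE_THREAD=1", "NO_STATIC=1", "USE_OPENMP=1",
        "NUM_PARALLEL=32", "LIBNAMESUFFIX=legate",
    ]
    if not run_tests:
        # Hypothetical opt-out for docker or other restricted build hosts.
        args.append("CROSS=1")
    return args
```

Keeping run_tests=True as the default preserves the OpenBLAS checks for users who build and run in the same environment.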

    opened by marcinz 18
  • legate numpy very slow compared to Python+Numpy

    I've been testing a simple Laplace Eq. solver to compare Python+Numpy to legate.numpy and legate is hugely slower than Numpy.

    The code is taken from: https://barbagroup.github.io/essential_skills_RRC/laplace/1/ . The code I actually run is the following:

    import numpy as np
    import time
    
    
    def L2_error(p, pn):
        return np.sqrt(np.sum((p - pn)**2)/np.sum(pn**2))
    # end def
    
    
    def laplace2d(p, l2_target):
        '''Iteratively solves the Laplace equation using the Jacobi method
    
        Parameters:
        ----------
        p: 2D array of float
            Initial potential distribution
        l2_target: float
            target for the difference between consecutive solutions
    
        Returns:
        -------
        p: 2D array of float
            Potential distribution after relaxation
        '''
    
        l2norm = 1.0
        icount = 0
        tot_time = 0.0
        pn = np.empty_like(p)
        while l2norm > l2_target:
    
            start = time.perf_counter()
    
            icount = icount + 1
            pn = p.copy()
            p[1:-1,1:-1] = .25 * (pn[1:-1,2:] + pn[1:-1, :-2] \
                                  + pn[2:, 1:-1] + pn[:-2, 1:-1])
    
            ##Neumann B.C. along x = L
            p[1:-1, -1] = p[1:-1, -2]     # 1st order approx of a derivative 
            l2norm = L2_error(p, pn)
            end = time.perf_counter()
    
            tot_time = tot_time + (end-start)
    
        # end while
    
        print("l2norm = ",l2norm)
        print("icount = ",icount)
        print("Total Iteration Time = ",tot_time)
        print("   Time per iteration = ",tot_time/icount)
    
        return p
    # end def
    
    
    
    if __name__ == "__main__":
    
        nx = 401
        ny = 401
    
        # Initial conditions
        p = np.zeros((ny,nx)) ##create a XxY vector of 0's
    
        # Dirichlet boundary conditions
        x = np.linspace(0,1,nx)
        p[-1,:] = np.sin(1.5*np.pi*x/x[-1])
        del x
    
    
        start = time.time()
        p = laplace2d(p.copy(), 1e-8)
        stop = time.time()
    
        print("Elapsed time = ",(stop-start)," secs")
        print(" ")
    
    
    # end if
    

    When I run it on my laptop with Anaconda Python3 and Numpy I get the following:

    $ python3 jacobi.py 
    l2norm =  9.99986062249016e-09
    icount =  153539
    Total Iteration Time =  127.02529454990054
       Time per iteration =  0.0008273161512703648
    Elapsed time =  127.14257955551147  secs
    

    When I change the import line to legate.numpy, I usually stop the code after 15 minutes of wall time. I have let it run for up to 60 minutes and it never converges.

    As a check, I've run the Numpy code with legate itself and it exactly matches the Numpy results.

    I have been experimenting with replacing the l2norm computations with numpy specific functions (np.subtract, np.square, etc.) but I have achieved no increase in performance.
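
For reference, a ufunc-based variant of the error computation along those lines might look like the following. The exact rewrite that was tried is not shown in this report, so L2_error_ufuncs is only a guess at its shape, written against canonical NumPy.

```python
import numpy as np


def L2_error(p, pn):
    """Relative L2 difference, as in the script above."""
    return np.sqrt(np.sum((p - pn) ** 2) / np.sum(pn ** 2))


def L2_error_ufuncs(p, pn):
    """Same quantity spelled with explicit NumPy ufunc calls."""
    diff = np.subtract(p, pn)
    num = np.sum(np.square(diff))
    den = np.sum(np.square(pn))
    return np.sqrt(np.divide(num, den))
```

Both versions perform the same two reductions per call, which is consistent with the observation that the rewrite did not change performance.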

    Does anyone have any recommendations?

    Thanks!

    Jeff

    (edit by Manolis: added some formatting for the code sections)

    enhancement 
    opened by laytonjbgmail 15
  • use OpenBLAS develop branch

    This is clearly an issue in OpenBLAS but it blocks my Legate Numpy install and is unexpected, based on my experience with OpenBLAS in other contexts.

    jhammond@nuclear:~/LEGATE/np$ python3 ./install.py --install-dir $HOME/LEGATE --with-core $HOME/LEGATE 2>&1 | tee log
    Verbose build is  off
    Legate is installing OpenBLAS into a local directory...
    Cloning into '/tmp/tmpm780ryjm'...
    Note: switching to 'd2b11c47774b9216660e76e2fc67e87079f26fa1'.
    
    You are in 'detached HEAD' state. You can look around, make experimental
    changes and commit them, and you can discard any commits you make in this
    state without impacting any branches by switching back to a branch.
    
    If you want to create a new branch to retain commits you create, you may
    do so (now or later) by using -c with the switch command. Example:
    
      git switch -c <new-branch-name>
    
    Or undo this operation with:
    
      git switch -
    
    Turn off this advice by setting config variable advice.detachedHead to false
    
    Switched to a new branch 'master'
    getarch_2nd.c: In function ‘main’:
    getarch_2nd.c:14:35: error: ‘SGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_UNROLL_M’?
       14 |     printf("SGEMM_UNROLL_M=%d\n", SGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   SBGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:14:35: note: each undeclared identifier is reported only once for each function it appears in
    getarch_2nd.c:15:35: error: ‘SGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_UNROLL_N’?
       15 |     printf("SGEMM_UNROLL_N=%d\n", SGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   SBGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:16:35: error: ‘DGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       16 |     printf("DGEMM_UNROLL_M=%d\n", DGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:17:35: error: ‘DGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       17 |     printf("DGEMM_UNROLL_N=%d\n", DGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:21:35: error: ‘CGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       21 |     printf("CGEMM_UNROLL_M=%d\n", CGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:22:35: error: ‘CGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       22 |     printf("CGEMM_UNROLL_N=%d\n", CGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:23:35: error: ‘ZGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       23 |     printf("ZGEMM_UNROLL_M=%d\n", ZGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:24:35: error: ‘ZGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       24 |     printf("ZGEMM_UNROLL_N=%d\n", ZGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:71:50: error: ‘SGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       71 |     printf("#define SLOCAL_BUFFER_SIZE\t%ld\n", (SGEMM_DEFAULT_Q * SGEMM_DEFAULT_UNROLL_N * 4 * 1 *  sizeof(float)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:72:50: error: ‘DGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       72 |     printf("#define DLOCAL_BUFFER_SIZE\t%ld\n", (DGEMM_DEFAULT_Q * DGEMM_DEFAULT_UNROLL_N * 2 * 1 *  sizeof(double)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:73:50: error: ‘CGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       73 |     printf("#define CLOCAL_BUFFER_SIZE\t%ld\n", (CGEMM_DEFAULT_Q * CGEMM_DEFAULT_UNROLL_N * 4 * 2 *  sizeof(float)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:74:50: error: ‘ZGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       74 |     printf("#define ZLOCAL_BUFFER_SIZE\t%ld\n", (ZGEMM_DEFAULT_Q * ZGEMM_DEFAULT_UNROLL_N * 2 * 2 *  sizeof(double)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    make: *** [Makefile.prebuild:74: getarch_2nd] Error 1
    Makefile:154: *** OpenBLAS: Detecting CPU failed. Please set TARGET explicitly, e.g. make TARGET=your_cpu_target. Please read README for the detail..  Stop.
    Traceback (most recent call last):
      File "./install.py", line 543, in <module>
        driver()
      File "./install.py", line 539, in driver
        install_legate_numpy(unknown=unknown, **vars(args))
      File "./install.py", line 359, in install_legate_numpy
        install_openblas(openblas_dir, thread_count, verbose)
      File "./install.py", line 143, in install_openblas
        execute_command(
      File "./install.py", line 62, in execute_command
        subprocess.check_call(args, cwd=cwd, shell=shell)
      File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['make', '-j', '8', 'USE_THREAD=1', 'NO_STATIC=1', 'USE_OPENMP=1', 'NUM_PARALLEL=32', 'LIBNAMESUFFIX=legate']' returned non-zero exit status 2.
    
    opened by jeffhammond 14
  • Fix reciprocal tests for zero values and improve test value customization

    Reciprocal is not valid for integers, which leads to test failures on certain platforms.

    https://numpy.org/doc/stable/reference/generated/numpy.reciprocal.html

    • Splits math ufunc tests into default operations and operations needing special values
    • Enables easier customization of input values to the different tests
    • Simplifies syntax for calling tests since all tests only take a single argument
    • Uses a hash-based seed for initializing random values so that tests use the same values whether tests are run individually or all-at-once
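
The hash-based seeding idea in the last bullet can be sketched as follows. The PR text does not say which hash is used, so the choice of SHA-256 and the helper names seed_for and random_inputs are assumptions.

```python
import hashlib

import numpy as np


def seed_for(test_name):
    """Derive a stable 32-bit seed from the test name."""
    digest = hashlib.sha256(test_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")


def random_inputs(test_name, size):
    """Values depend only on the test name, not on execution order."""
    rng = np.random.default_rng(seed_for(test_name))
    return rng.random(size)
```

A cryptographic digest is used here rather than the built-in hash() because Python randomizes string hashes per process by default, which would defeat the goal of getting the same values on every run.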
    opened by jjwilke 13
  • Refactor test driver for cpu/gpu sharding

    This PR adds support for cpu/gpu sharding so that it is no longer necessary to unset REALM_SYNTHETIC_CORE_MAP.

    Other notes:

    • Test stages were refactored and simplified. Most control was moved to base protocol class, with stage implementations responsible only for computing the sharding spec, etc.
    • --debug now also includes explicitly modified env vars in printed output
    • --fbmem command line option was added for GPU stage
    • Some missing docs were added
    • Some minimal tests of sharding computations were added, but perhaps more is advised

    Tested locally with MPI GASNet conduit, with variations of the following invocations:

    GASNET_QUIET=1 ./test.py --use=openmp --omps=2 --ompthreads=1 --launcher=mpirun --debug
    GASNET_QUIET=1 ./test.py --use=cpus --cpus 2 --launcher=mpirun --debug 
    GASNET_QUIET=1 ./test.py --use=cuda --gpus 2 --launcher=mpirun --debug
    GASNET_QUIET=1 ./test.py --use=eager --launcher=mpirun --debug
    
    opened by bryevdv 13
  • Realm not completing gather copy on the GPU

    Problem

    Advanced indexing of a relatively large (e.g., length 10K) 1D array raises UnboundLocalError: local variable 'shardfn' referenced before assignment, rather than NotImplementedError.

    I understand that advanced indexing is mostly not yet implemented. Most related routines raise NotImplementedError to let users know about this situation. However, this particular use case raises this different error, which seems to be a bug to me.

    To reproduce

    1. step 1: prepare test.py:
      from legate import numpy
      a = numpy.arange(10000)
      print(a[(1, 2, 3), ]) 
      
    2. step 2: run with, for example
      $ legate --cpus 1 test.py
      

    Output

    Traceback (most recent call last):
      File "<blahblah>/lib/python3.8/site-packages/legion_top.py", line 394, in legion_python_main
        run_path(args[start], run_name='__main__')
      File "<blahblah>/lib/python3.8/site-packages/legion_top.py", line 193, in run_path
        exec(code, module.__dict__, module.__dict__)
      File "./test.py", line 3, in <module>
        print(a[(1, 2, 3), ])
      File "<blahblah>/lib/python3.8/site-packages/legate/numpy/array.py", line 381, in __getitem__
        shape=None, thunk=self._thunk.get_item(key, stacklevel=2)
      File "<blahblah>/lib/python3.8/site-packages/legate/numpy/deferred.py", line 414, in get_item
        copy = Copy(mapper=self.runtime.mapper_id, tag=shardfn)
    UnboundLocalError: local variable 'shardfn' referenced before assignment
    

    Expected results

    Either [1, 2, 3] or NotImplementedError.
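
For comparison, the same indexing expression under canonical NumPy gives the first of those two results. This is plain NumPy behavior, shown only as the reference point for the report.

```python
import numpy as np

a = np.arange(10000)
# A tuple containing one sequence is advanced (integer-array) indexing
# along the first axis, so this selects elements 1, 2 and 3.
result = a[(1, 2, 3), ]
print(result)  # prints [1 2 3]
```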

    Notes

    • Interestingly, smaller arrays do not have this issue. For example, if a = numpy.arange(100), the code works fine.
    • Another way to make it work is to use GPUs instead of CPUs. For example, legate --gpus 1 test.py works fine. This is interesting, as the GPU implementation seems to be more stable than the CPU implementation?
    bug in progress 
    opened by piyueh 13
  • Use pytest for test running

    This PR converts the existing test modules to use pytest for test discovery and running. Tests were also updated for minor cleanup and to utilize pytest features (e.g. parametrize and fixtures) effectively. The end result is to afford testing options that are more familiar and ergonomic to "standard" Python devs.

    Overview

    Cleanup

    A few commits perform some minor cleanup on the existing tests:

    • https://github.com/nv-legate/cunumeric/pull/297/commits/572e7f106cffb80df2d0dcd0bb17ff423d2e2b11 — remove old and outdated "universal functions" tests
    • https://github.com/nv-legate/cunumeric/pull/297/commits/5e0ff856667234f0bff4d0ff48ab08dcf192ed7c — remove a useless test
    • https://github.com/nv-legate/cunumeric/pull/297/commits/5d8a6172caa97501e795688c22d969efbe404056 — remove redundant bare return statements throughout

    File movement

    • To make it easier to use pytest for test selection, the existing tests were moved and renamed:

    • https://github.com/nv-legate/cunumeric/pull/297/commits/3f3c5cb8d3ed05a0981990e9920d851d58dd19a1 — move files to integration subdirectory

    • https://github.com/nv-legate/cunumeric/pull/297/commits/9f4f4108449f56f360e90c2f56d30569796b7c0c — prefix all test module filenames with test_

    Moving the files to a subdirectory allows this entire group of tests to be easily run by executing pytest with this directory path. The test_ prefix is to conform with pytest expectations for default test discovery.

    Pytest updates

    • https://github.com/nv-legate/cunumeric/pull/297/commits/571ea075d549da3875550aa0d28f4154af83339d — rename some helper functions to avoid test_ prefix. By default, pytest will interpret anything starting with test or Test as a test to run.
    • https://github.com/nv-legate/cunumeric/pull/297/commits/76779365e25cff1c9274b17f454145267aee51b8 — a minimal update that adds pytest scaffolding to each test module and the least change to get running. For most tests this was just a couple of lines of boilerplate change. But a few tests required slightly more extensive updates.
    • https://github.com/nv-legate/cunumeric/pull/297/commits/bbda5a96336d320ade26e34cd6359acc6da6d0b7 — more in-depth changes to split out tests into sensible smaller jobs, and to utilize parametrization and fixtures.
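
A minimal sketch of what a parametrized test module in this style could look like is given below. It is not taken from the PR: the make_input helper and the body of test_nd are hypothetical, and canonical NumPy stands in for cunumeric so the sketch is self-contained.

```python
import numpy as np
import pytest


def make_input(ndim):
    """Helper; deliberately not prefixed with test_ so pytest does not collect it."""
    return np.arange(2 ** ndim).reshape((2,) * ndim)


@pytest.mark.parametrize("ndim", range(1, 5))
def test_nd(ndim):
    a = make_input(ndim)
    out = np.repeat(a, 2, axis=0)
    # Repeating along axis 0 doubles only the leading dimension.
    assert out.shape == (4,) + (2,) * (ndim - 1)
```

pytest reports each parameter value as its own test, with ids of the form test_nd[1] through test_nd[4], which is what makes the finer-grained reporting possible.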

    Notes

    • The existing test.py module runs exactly as before. The only change was to update it for the new path to the test files ~~(see note about verbose output, though)~~
    • If desired to make this PR smaller, the final commit could be removed and submitted separately.
    • There is a legate.core issue that currently requires fixtures to avoid cunumeric array re-use in tests
      • https://github.com/nv-legate/legate.core/issues/205
    • In the final commit I did try to split up tests and/or use parametrize anywhere it made sense, so that more fine-grained reporting output becomes available:
      tests/integration/test_reduction_complex.py::test_sum PASSED   [ 96%]
      tests/integration/test_reduction_complex.py::test_prod PASSED  [ 96%]
      tests/integration/test_repeat.py::test_basic PASSED            [ 96%]
      tests/integration/test_repeat.py::test_axis PASSED             [ 96%]
      tests/integration/test_repeat.py::test_nd[1] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[2] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[3] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[4] PASSED            [ 97%]
      

    Further notes will be inline.

    Operation

    Running test.py

    The existing way to run tests with test.py is unchanged:

    ./test.py --use=cuda 
    

    This still generates the existing report:

    image

    ~~(Note: Currently test.py -v only shows stdout from test.py and not from tests. This will be straightforward to restore but I would like to do that in a follow-on PR dedicated to test.py. Test stdout can be observed with either of the methods below with the standard -s flag. cc @magnatelee)~~

    Executing individual tests

    Individual tests can still be executed by running the module with legate, including by passing in runtime command line options:

    legate tests/integration/test_nonzero.py --gpus 2 -cunumeric:test
    

    However, now the output is a standard pytest report:

    image

    It is also now possible to pass standard pytest options, e.g. -v for a verbose report:

    image

    Using pytest for test discovery

    Although it is not yet possible to simply run pytest <dir> in the typical way, it is possible to achieve the same operation with a little more explicit invocation:

    legate -c "import pytest; pytest.main(['tests/integration'])" --gpus 2 -cunumeric:test 
    

    Note that all the standard pytest options, e.g. -k and -m for test filtering, can be used.

    The above will run all the tests under tests/integration, and generate a standard combined pytest report:

    image

    The full report output text can be seen here:

    legate37 ❯ legate -c "import pytest; pytest.main(['tests/integration'])" --gpus 2 -cunumeric:test 
    
    WARNING: Disabling control replication for interactive run
    ========================================================================================= test session starts ==========================================================================================
    platform linux -- Python 3.7.12, pytest-7.1.1, pluggy-1.0.0
    rootdir: /home/bryan/work/cunumeric
    collected 2860 items                                                                                                                                                                                   
    
    tests/integration/test_2d_reduction.py ...                                                                                                                                                       [  0%]
    tests/integration/test_3d_reduction.py .                                                                                                                                                         [  0%]
    tests/integration/test_advanced_indexing.py .                                                                                                                                                    [  0%]
    tests/integration/test_append.py ........                                                                                                                                                        [  0%]
    tests/integration/test_argmin.py ..                                                                                                                                                              [  0%]
    tests/integration/test_array_creation.py ..............                                                                                                                                          [  1%]
    tests/integration/test_array_split.py .......                                                                                                                                                    [  1%]
    tests/integration/test_binary_op.py .....                                                                                                                                                        [  1%]
    tests/integration/test_binary_op_2d.py ....                                                                                                                                                      [  1%]
    tests/integration/test_binary_op_broadcast.py ....                                                                                                                                               [  1%]
    tests/integration/test_binary_op_complex.py .....                                                                                                                                                [  1%]
    tests/integration/test_binary_op_typing.py ..................................................................................................................................................... [  7%]
    ................................................................................................................................................................................................ [ 13%]
    ................................................................................................................................................................................................ [ 20%]
    ................................................................................................................................................................................................ [ 27%]
    ................................................................................................................................................................................................ [ 33%]
    ................................................................................................................................................................................................ [ 40%]
    ................................................................................................................................................................................................ [ 47%]
    ................................................................................................................................................................................................ [ 54%]
    ................................................................................................................................................................................................ [ 60%]
    ................................................................................................................................................................................................ [ 67%]
    ................................................................................................................................................................................................ [ 74%]
    .................................................................................................................................................................................                [ 80%]
    tests/integration/test_binary_ufunc.py .                                                                                                                                                         [ 80%]
    tests/integration/test_bincount.py ......                                                                                                                                                        [ 80%]
    tests/integration/test_block.py ............                                                                                                                                                     [ 81%]
    tests/integration/test_cholesky.py .........                                                                                                                                                     [ 81%]
    tests/integration/test_compare.py ......                                                                                                                                                         [ 81%]
    tests/integration/test_complex_ops.py ....                                                                                                                                                       [ 81%]
    tests/integration/test_concatenate_stack.py ................................................                                                                                                     [ 83%]
    tests/integration/test_contains.py ..                                                                                                                                                            [ 83%]
    tests/integration/test_convolve.py ......                                                                                                                                                        [ 83%]
    tests/integration/test_copy.py .                                                                                                                                                                 [ 83%]
    tests/integration/test_dot.py .........................                                                                                                                                          [ 84%]
    tests/integration/test_einsum.py .......................................................................................................................................................         [ 89%]
    tests/integration/test_eye.py ...............                                                                                                                                                    [ 90%]
    tests/integration/test_fill.py .                                                                                                                                                                 [ 90%]
    tests/integration/test_flatten.py .........                                                                                                                                                      [ 90%]
    tests/integration/test_flip.py ..........                                                                                                                                                        [ 91%]
    tests/integration/test_get_item.py .                                                                                                                                                             [ 91%]
    tests/integration/test_index_routines.py ............                                                                                                                                            [ 91%]
    tests/integration/test_ingest.py ....                                                                                                                                                            [ 91%]
    tests/integration/test_inlinemap-keeps-region-alive.py .                                                                                                                                         [ 91%]
    tests/integration/test_inner.py .........................                                                                                                                                        [ 92%]
    tests/integration/test_interop.py .                                                                                                                                                              [ 92%]
    tests/integration/test_intra_array_copy.py ....                                                                                                                                                  [ 92%]
    tests/integration/test_jacobi.py .                                                                                                                                                               [ 92%]
    tests/integration/test_length.py ....                                                                                                                                                            [ 92%]
    tests/integration/test_linspace.py ....                                                                                                                                                          [ 93%]
    tests/integration/test_logical.py .....s                                                                                                                                                         [ 93%]
    tests/integration/test_lstm_backward_test.py .                                                                                                                                                   [ 93%]
    tests/integration/test_lstm_simple_forward.py .                                                                                                                                                  [ 93%]
    tests/integration/test_map_reduce.py .                                                                                                                                                           [ 93%]
    tests/integration/test_mask.py ....                                                                                                                                                              [ 93%]
    tests/integration/test_matmul.py ................                                                                                                                                                [ 94%]
    tests/integration/test_nonzero.py ........                                                                                                                                                       [ 94%]
    tests/integration/test_norm.py .                                                                                                                                                                 [ 94%]
    tests/integration/test_numpy_interop.py .                                                                                                                                                        [ 94%]
    tests/integration/test_outer.py .................                                                                                                                                                [ 95%]
    tests/integration/test_overwrite_slice.py .                                                                                                                                                      [ 95%]
    tests/integration/test_randint.py ..                                                                                                                                                             [ 95%]
    tests/integration/test_reduction.py .......                                                                                                                                                      [ 95%]
    tests/integration/test_reduction_axis.py ..................                                                                                                                                      [ 96%]
    tests/integration/test_reduction_complex.py ..                                                                                                                                                   [ 96%]
    tests/integration/test_repeat.py ......                                                                                                                                                          [ 96%]
    tests/integration/test_reshape.py .................                                                                                                                                              [ 96%]
    tests/integration/test_set_item.py .                                                                                                                                                             [ 96%]
    tests/integration/test_shape.py .s                                                                                                                                                               [ 97%]
    tests/integration/test_singleton_access.py .                                                                                                                                                     [ 97%]
    tests/integration/test_slicing.py ..ss                                                                                                                                                           [ 97%]
    tests/integration/test_sort.py .                                                                                                                                                                 [ 97%]
    tests/integration/test_squeeze.py .                                                                                                                                                              [ 97%]
    tests/integration/test_swapaxes.py ....                                                                                                                                                          [ 97%]
    tests/integration/test_tensordot.py .........................                                                                                                                                    [ 98%]
    tests/integration/test_tile.py ..                                                                                                                                                                [ 98%]
    tests/integration/test_transpose.py ..                                                                                                                                                           [ 98%]
    tests/integration/test_trilu.py ..........                                                                                                                                                       [ 98%]
    tests/integration/test_unary_functions_2d.py ........                                                                                                                                            [ 99%]
    tests/integration/test_unary_functions_2d_complex.py .....                                                                                                                                       [ 99%]
    tests/integration/test_unary_ufunc.py .                                                                                                                                                          [ 99%]
    tests/integration/test_unique.py .....                                                                                                                                                           [ 99%]
    tests/integration/test_update.py .                                                                                                                                                               [ 99%]
    tests/integration/test_vdot.py ....                                                                                                                                                              [ 99%]
    tests/integration/test_view.py .                                                                                                                                                                 [ 99%]
    tests/integration/test_vstack.py ...                                                                                                                                                             [ 99%]
    tests/integration/test_where.py .....s                                                                                                                                                           [ 99%]
    tests/integration/test_window.py .                                                                                                                                                               [100%]
    
    =========================================================================================== warnings summary ===========================================================================================
    tests/integration/test_advanced_indexing.py: 124 warnings
      /home/bryan/work/cunumeric/cunumeric/array.py:773: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if key.dtype != np.bool and not np.issubdtype(
    
    tests/integration/test_advanced_indexing.py: 124 warnings
      /home/bryan/work/cunumeric/cunumeric/array.py:777: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if key.dtype != np.bool and key.dtype != np.int64:
    
    tests/integration/test_advanced_indexing.py: 30 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:425: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        and key.dtype == np.bool
    
    tests/integration/test_advanced_indexing.py: 120 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:479: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        and k.dtype == np.bool
    
    tests/integration/test_advanced_indexing.py: 120 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:516: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if k.dtype == np.bool:
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:118: RuntimeWarning: converting index array to int64 type
        assert np.array_equal(x[index], x_num[index_num])
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:122: RuntimeWarning: converting index array to int64 type
        x_num[index_num] = 3.5
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:129: UserWarning: cuNumeric performing implicit type conversion from float64 to int64
        x_num[index_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:129: RuntimeWarning: converting index array to int64 type
        x_num[index_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:559: UserWarning: cuNumeric performing implicit type conversion from float64 to int64
        z_num[indx_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:609: RuntimeWarning: converting index array to int64 type
        res_num = x_num[ind_num, ind_num]
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:627: UserWarning: cuNumeric performing implicit type conversion from int16 to float64
        x_num[ind_num, ind_num] = b_num
    
    tests/integration/test_cholesky.py::test_complex[8]
    tests/integration/test_cholesky.py::test_complex[9]
    tests/integration/test_cholesky.py::test_complex[255]
    tests/integration/test_cholesky.py::test_complex[512]
      /home/bryan/work/cunumeric/tests/integration/test_cholesky.py:48: UserWarning: cuNumeric performing implicit type conversion from complex128 to float64
        d[1] = b
    
    tests/integration/test_dot.py::test_dot[1-1]
    tests/integration/test_dot.py::test_dot[1-2]
    tests/integration/test_dot.py::test_dot[2-1]
    tests/integration/test_dot.py::test_dot[2-2]
      /home/bryan/work/cunumeric/tests/integration/test_dot.py:34: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        return lib.dot(*args, **kwargs)
    
    tests/integration/test_einsum.py::test_cast[->]
    tests/integration/test_einsum.py::test_cast[a->]
    tests/integration/test_einsum.py::test_cast[a,->]
    tests/integration/test_einsum.py::test_cast[a,a->]
      /home/bryan/work/cunumeric/cunumeric/array.py:123: ComplexWarning: Casting complex values to real discards the imaginary part
        *args, **kwargs
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float16 to complex64
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float32 to float16
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float32 to complex64
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from complex64 to float16
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from complex64 to float32
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float16 to complex64
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float32 to complex64
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_flatten.py::test_basic[(1, 1)]
    tests/integration/test_flatten.py::test_basic[(1, 1, 1)]
    tests/integration/test_flatten.py::test_basic[(1, 10)]
    tests/integration/test_flatten.py::test_basic[(1, 10, 1)]
    tests/integration/test_flatten.py::test_basic[(10, 10)]
    tests/integration/test_flatten.py::test_basic[(10, 10, 10)]
      /home/bryan/work/cunumeric/tests/integration/test_flatten.py:26: RuntimeWarning: cuNumeric has not implemented reshape using Fortran-like index order and is falling back to canonical numpy. You may notice significantly decreased performance for this function call.
        c = num_arr.flatten(order)
    
    tests/integration/test_index_routines.py::test_choose_1d
      /home/bryan/work/cunumeric/tests/integration/test_index_routines.py:42: UserWarning: cuNumeric performing implicit type conversion from int64 to float64
        assert np.array_equal(
    
    tests/integration/test_sort.py::test
      /home/bryan/work/cunumeric/tests/integration/test_sort.py:23: UserWarning: cuNumeric performing implicit type conversion from complex64 to complex128
        if not num.allclose(a_np, a_num):
    
    tests/integration/test_vdot.py::test[complex64-complex128]
    tests/integration/test_vdot.py::test[complex128-complex64]
      /home/bryan/work/cunumeric/tests/integration/test_vdot.py:28: UserWarning: cuNumeric performing implicit type conversion from complex64 to complex128
        mk_0to1_array(lib, (5,), dtype=b_dtype),
    
    -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
    ============================================================================ 2855 passed, 5 skipped, 601 warnings in 41.34s ============================================================================
    

    Some warnings show up at the end of the output, mostly about implicit casting. I am not sure whether these are expected (I suspect they are).
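One way to judge whether the warnings are expected is to group them by category before reading them one by one. The sketch below is not part of this PR. It assumes the pytest warnings summary has been saved as text, and `tally_warning_sites` is a hypothetical helper name. It counts distinct warning sites (file, line, message) per category, not total occurrences. The sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches the "path:line: Category: message" lines that pytest prints
# in its warnings summary. Header lines such as
# "tests/integration/test_dot.py::test_dot[1-1]" do not match.
WARNING_LINE = re.compile(r"^\s*(\S+):(\d+): (\w+Warning): (.*)$")


def tally_warning_sites(summary_text):
    """Count distinct (file, line, message) warning sites per category.

    Hypothetical helper for illustration; it counts sites, not the
    number of times each warning was raised.
    """
    sites = set()
    for line in summary_text.splitlines():
        match = WARNING_LINE.match(line)
        if match:
            path, lineno, category, message = match.groups()
            sites.add((category, path, int(lineno), message))
    return Counter(category for category, *_ in sites)


# A few lines taken from the warnings summary shown above.
sample = """\
tests/integration/test_advanced_indexing.py: 124 warnings
  /home/bryan/work/cunumeric/cunumeric/array.py:773: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`.
tests/integration/test_advanced_indexing.py::test
  /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:118: RuntimeWarning: converting index array to int64 type
tests/integration/test_cholesky.py::test_complex[8]
  /home/bryan/work/cunumeric/tests/integration/test_cholesky.py:48: UserWarning: cuNumeric performing implicit type conversion from complex128 to float64
tests/integration/test_dot.py::test_dot[1-1]
  /home/bryan/work/cunumeric/tests/integration/test_dot.py:34: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
"""

counts = tally_warning_sites(sample)
for category, n in sorted(counts.items()):
    print(f"{category}: {n} site(s)")
```

Grouping this way separates the `DeprecationWarning` entries for `np.bool`, which the message says can be silenced by using `bool` or `np.bool_`, from the `UserWarning` entries about implicit type conversion, which are the ones in question here.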

    Future work

    Things that are not in this PR but are planned for follow-on PRs:

    • Using pytest to run the examples
    • ~~Restoring previous -v behaviour to test.py~~
    • Increased documentation for running tests
    • Allowing tests to be run with the simpler legate -m pytest
    opened by bryevdv 12
  • Initial unit tests

    Initial unit tests

    This PR adds some "basic" unit tests to a subset of cunumeric modules:

    • cunumeric.coverage
    • cunumeric.patch
    • cunumeric.utils

    Currently, these tests may be run manually by executing the command

    legate -c "import pytest; pytest.main(['tests/unit', '-v'])"
    

    which will result in output similar to:

    WARNING: Disabling control replication for interactive run
    ======================================================== test session starts ========================================================
    platform linux -- Python 3.7.12, pytest-7.1.1, pluggy-1.0.0 -- /home/bryan/anaconda3/envs/legate37/bin/python3
    cachedir: .pytest_cache
    rootdir: /home/bryan/work/cunumeric
    collected 66 items                                                                                                                  
    
    tests/unit/cunumeric/test_coverage.py::test_FALLBACK_WARNING PASSED                                                           [  1%]
    tests/unit/cunumeric/test_coverage.py::test_MOD_INTERNAL PASSED                                                               [  3%]
    tests/unit/cunumeric/test_coverage.py::test_NDARRAY_INTERNAL PASSED                                                           [  4%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_empty PASSED                                               [  6%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_no_filters PASSED                                          [  7%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_name_filters PASSED                                        [  9%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_type_filters PASSED                                        [ 10%]
    tests/unit/cunumeric/test_coverage.py::test_implemented PASSED                                                                [ 12%]
    tests/unit/cunumeric/test_coverage.py::Test_unimplemented::test_reporting_True PASSED                                         [ 13%]
    tests/unit/cunumeric/test_coverage.py::Test_unimplemented::test_reporting_False PASSED                                        [ 15%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_module::test_report_coverage_True PASSED                                    [ 16%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_module::test_report_coverage_False PASSED                                   [ 18%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_class::test_report_coverage_True PASSED                                     [ 19%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_class::test_report_coverage_False PASSED                                    [ 21%]
    tests/unit/cunumeric/test_patch.py::test_no_patch PASSED                                                                      [ 22%]
    tests/unit/cunumeric/test_patch.py::test_patch PASSED                                                                         [ 24%]
    tests/unit/cunumeric/test_utils.py::test_find_last_user_stacklevel PASSED                                                     [ 25%]
    tests/unit/cunumeric/test_utils.py::test_get_line_number_from_frame PASSED                                                    [ 27%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_default_top_only PASSED                                  [ 28%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_top_only_True PASSED                                     [ 30%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_top_only_False PASSED                                    [ 31%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[foo] PASSED                                        [ 33%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[10] PASSED                                         [ 34%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[10.2] PASSED                                       [ 36%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value3] PASSED                                     [ 37%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value4] PASSED                                     [ 39%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value5] PASSED                                     [ 40%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value6] PASSED                                     [ 42%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[None] PASSED                                       [ 43%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float16] PASSED                                   [ 45%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float32] PASSED                                   [ 46%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float64] PASSED                                   [ 48%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float] PASSED                                     [ 50%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int16] PASSED                                     [ 51%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int32] PASSED                                     [ 53%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int64] PASSED                                     [ 54%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int] PASSED                                       [ 56%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint16] PASSED                                    [ 57%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint32] PASSED                                    [ 59%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint64] PASSED                                    [ 60%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[bool_] PASSED                                     [ 62%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[bool] PASSED                                      [ 63%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[float128] PASSED                                [ 65%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[complex64] PASSED                               [ 66%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[datetime64] PASSED                              [ 68%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape0-0] PASSED                                                    [ 69%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape1-10] PASSED                                                   [ 71%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape2-6] PASSED                                                    [ 72%]
    tests/unit/cunumeric/test_utils.py::test_get_arg_dtype PASSED                                                                 [ 74%]
    tests/unit/cunumeric/test_utils.py::test_get_arg_value_dtype PASSED                                                           [ 75%]
    tests/unit/cunumeric/test_utils.py::test_dot_modes PASSED                                                                     [ 77%]
    tests/unit/cunumeric/test_utils.py::test_inner_modes PASSED                                                                   [ 78%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[0-0] PASSED                                                         [ 80%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[0-1] PASSED                                                         [ 81%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[1-0] PASSED                                                         [ 83%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes PASSED                                                                  [ 84%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[1-3-2] PASSED                                  [ 86%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[3-1-2] PASSED                                  [ 87%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[1-1-2] PASSED                                  [ 89%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_axes_length PASSED                                         [ 90%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_negative_axes PASSED                                       [ 92%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_mismatched_axes PASSED                                     [ 93%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_axes_oob PASSED                                            [ 95%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_single_axis PASSED                                             [ 96%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_tuple_axis PASSED                                              [ 98%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_explicit_axis PASSED                                           [100%]
    
    ======================================================== 66 passed in 2.71s =========================================================
    
    
    

    Notes

    • The mock package was added to the conda environment file. In an existing environment it must be installed manually before the tests can run.
    • I haven't thought about contracting differential forms in ~20 years, so the product mode tests record a snapshot of current behavior as-is, while trying to cover error paths explicitly. I did not separately try to verify correctness.
    • Slots were removed from Runtime. They were interfering with the ability to mock Runtime methods and attributes, but also are not really appropriate in this situation. The intended purpose of __slots__ is a space-optimization in the case of many small objects.
    • I added a flag to the wrappers returned by implemented / unimplemented just to greatly streamline testing.
    • The last deprecated uses of np.bool, etc. were removed.
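
The note about slots can be illustrated with a minimal sketch. The two classes below are hypothetical stand-ins (they are not the project's Runtime class), and the helper name is made up for illustration. The sketch only shows the general Python behavior: an instance of a class that declares __slots__ has no __dict__, so replacing one of its methods with a mock fails.

```python
from unittest import mock


class SlottedRuntime:
    """Hypothetical stand-in for a class that declares __slots__."""

    __slots__ = ("num_procs",)

    def __init__(self):
        self.num_procs = 1

    def shutdown(self):
        return "real shutdown"


class PlainRuntime:
    """The same stand-in without __slots__, so instances have a __dict__."""

    def __init__(self):
        self.num_procs = 1

    def shutdown(self):
        return "real shutdown"


def can_mock_method(runtime):
    """Return True if a method on this instance can be replaced by a mock."""
    try:
        runtime.shutdown = mock.Mock(return_value="mocked")
    except AttributeError:
        # Slotted instances have no __dict__, so the assignment is rejected.
        return False
    return runtime.shutdown() == "mocked"
```

With these stand-ins, can_mock_method(PlainRuntime()) succeeds while can_mock_method(SlottedRuntime()) does not, which is the kind of friction the note describes.
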
    opened by bryevdv 10
  • Hang during destroying of the interpreter

    This bug happens in the conda environment (confirmed on at least 2 different machines):

    [0 - 7ffec1256700]    7.756864 {2}{python}: destroying interpreter
    ^C
    Thread 1 "legion_python" received signal SIGINT, Interrupt.
    syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
    (gdb) info threads
      Id   Target Id         Frame
    * 1    Thread 0x7ffff11c6000 (LWP 24267) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      2    Thread 0x7fffeb062700 (LWP 24293) "legion_python" 0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
      3    Thread 0x7fffea761700 (LWP 24295) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      4    Thread 0x7fffea65b700 (LWP 24296) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      9    Thread 0x7ffec1fff700 (LWP 24301) "jemalloc_bg_thd" 0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      10   Thread 0x7ffec1256700 (LWP 24302) "legion_python" 0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7ffff11c5f30) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7ffff11c5f30) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3c37f8d in Realm::UnfairCondVar::wait (this=0x555555662130) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:874
    #4  0x00007ffff3c63a6e in Realm::KernelThreadTaskScheduler::shutdown (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:1291
    #5  0x00007ffff3fc6ca9 in Realm::LocalPythonProcessor::shutdown (this=0x555555661bf0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:736
    #6  0x00007ffff3abe104 in Realm::RuntimeImpl::wait_for_shutdown (this=0x555555607860) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:2310
    #7  0x00007ffff3ab63fe in Realm::Runtime::wait_for_shutdown (this=0x7fffffff92b8) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:636
    #8  0x00007ffff6cbe2a3 in Legion::Internal::Runtime::start (argc=3, argv=0x7fffffff9478, background=false, supply_default_mapper=true) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/cmdline.inl:29262
    #9  0x00007ffff666e257 in Legion::Runtime::start (argc=3, argv=0x7fffffff9478, background=false, default_mapper=true) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/runtime/realm/legion_context.h:7371
    #10 0x00005555555961bb in main (argc=3, argv=0x7fffffff9478) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/runtime/realm/legion_mapping.inl:217
    (gdb) thread 2
    [Switching to thread 2 (Thread 0x7fffeb062700 (LWP 24293))]
    #0  0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
    32	../sysdeps/unix/sysv/linux/accept4.c: No such file or directory.
    (gdb) bt
    #0  0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
    #1  0x00007ffff1d76cd3 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
    #2  0x00007ffff1e18bd6 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
    #3  0x00007ffff13d46db in start_thread (arg=0x7fffeb062700) at pthread_create.c:463
    #4  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 3
    [Switching to thread 3 (Thread 0x7fffea761700 (LWP 24295))]
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7fffea761630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7fffea761630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3aeef74 in Realm::BackgroundWorkThread::main_loop (this=0x55555566dc00) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:166
    #4  0x00007ffff3af2a26 in Realm::Thread::thread_entry_wrapper<Realm::BackgroundWorkThread, &Realm::BackgroundWorkThread::main_loop> (obj=0x55555566dc00) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/mutex.inl:97
    #5  0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x55555566df90) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #6  0x00007ffff13d46db in start_thread (arg=0x7fffea761700) at pthread_create.c:463
    #7  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 4
    [Switching to thread 4 (Thread 0x7fffea65b700 (LWP 24296))]
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	in ../sysdeps/unix/sysv/linux/x86_64/syscall.S
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7fffea65b630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7fffea65b630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3aeef74 in Realm::BackgroundWorkThread::main_loop (this=0x5555556762f0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:166
    #4  0x00007ffff3af2a26 in Realm::Thread::thread_entry_wrapper<Realm::BackgroundWorkThread, &Realm::BackgroundWorkThread::main_loop> (obj=0x5555556762f0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/mutex.inl:97
    #5  0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x55555566e6b0) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #6  0x00007ffff13d46db in start_thread (arg=0x7fffea65b700) at pthread_create.c:463
    #7  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 9
    [Switching to thread 9 (Thread 0x7ffec1fff700 (LWP 24301))]
    #0  0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
    88	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
    (gdb) bt
    #0  0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
    #1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ffec260a638, cond=0x7ffec260a5c8) at pthread_cond_wait.c:502
    #2  __pthread_cond_wait (cond=0x7ffec260a5c8, mutex=0x7ffec260a638) at pthread_cond_wait.c:655
    #3  0x00007ffec3a9ef6b in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=<optimized out>) at src/background_thread.c:232
    #4  background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
    #5  background_thread0_work (tsd=<optimized out>) at src/background_thread.c:452
    #6  background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:490
    #7  background_thread_entry () at src/background_thread.c:522
    #8  0x00007ffff13d46db in start_thread (arg=0x7ffec1fff700) at pthread_create.c:463
    #9  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 10
    [Switching to thread 10 (Thread 0x7ffec1256700 (LWP 24302))]
    #0  0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    205	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
    (gdb) bt
    #0  0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    #1  do_futex_wait (sem=sem@entry=0x7ffecc070b10, abstime=0x0) at sem_waitcommon.c:111
    #2  0x00007ffff13dd8b8 in __new_sem_wait_slow (sem=sem@entry=0x7ffecc070b10, abstime=0x0) at sem_waitcommon.c:181
    #3  0x00007ffff13dd929 in __new_sem_wait (sem=sem@entry=0x7ffecc070b10) at sem_wait.c:42
    #4  0x00007fffe374d6a2 in PyThread_acquire_lock_timed.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/thread_pthread.h:486
    #5  0x00007fffe3888e71 in acquire_timed () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Modules/_threadmodule.c:102
    #6  0x00007fffe3888f7b in lock_PyThread_acquire_lock () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Modules/_threadmodule.c:183
    #7  0x00007fffe37c4e57 in method_vectorcall_VARARGS_KEYWORDS () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/unicodeobject.c:1404
    #8  0x00007fffe37e50ff in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb7078ef0, callable=0x7fffe3201e40, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:123
    #9  PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb7078ef0, callable=0x7fffe3201e40) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:123
    #10 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7ffec1251f40, tstate=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:5867
    #11 _PyEval_EvalFrameDefault () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:4198
    #12 0x00007fffe37a6f95 in _PyEval_EvalFrame (throwflag=0, f=0x7ffeb7078d60, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:4995
    #13 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=<optimized out>, tstate=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:5065
    #14 _PyFunction_Vectorcall.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/call.c:342
    #15 0x00007fffe3738233 in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=9223372036854775808, args=0x7ffec1252168, callable=0x7fffe2f5b7f0, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:114
    #16 PyObject_VectorcallMethod.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/call.c:770
    #17 0x00007fffe387a8cc in _PyObject_CallMethodIdNoArgs (name=0x7fffe39c3970 <PyId__shutdown.15474>, self=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:239
    #18 wait_for_thread_shutdown () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/pylifecycle.c:2823
    #19 0x00007fffe38a4670 in Py_FinalizeEx.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/pylifecycle.c:1719
    #20 0x00007ffff3fc543f in Realm::PythonInterpreter::~PythonInterpreter (this=0x7ffecc000b20, __in_chrg=<optimized out>) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:196
    #21 0x00007ffff3fc7062 in Realm::LocalPythonProcessor::destroy_interpreter (this=0x555555661bf0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:801
    #22 0x00007ffff3fc65a8 in Realm::PythonThreadTaskScheduler::worker_terminate (this=0x555555661e10, switch_to=0x0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:662
    #23 0x00007ffff3c6333c in Realm::ThreadedTaskScheduler::scheduler_loop (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:1152
    #24 0x00007ffff3fc5e77 in Realm::PythonThreadTaskScheduler::python_scheduler_loop (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:395
    #25 0x00007ffff3fcb10c in Realm::Thread::thread_entry_wrapper<Realm::PythonThreadTaskScheduler, &Realm::PythonThreadTaskScheduler::python_scheduler_loop> (obj=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/threads.h:97
    #26 0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x7ffeccaff9c0) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #27 0x00007ffff13d46db in start_thread (arg=0x7ffec1256700) at pthread_create.c:463
    #28 0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    

    The conda environment:

    _libgcc_mutex             0.1                 conda_forge    conda-forge
    _openmp_mutex             4.5                       1_gnu    conda-forge
    _sysroot_linux-64_curr_repodata_hack 3                   h5bd9786_13    conda-forge
    abseil-cpp                20210324.2           h9c3ff4c_0    conda-forge
    arrow-cpp                 6.0.1           py310h500f8fe_8_cpu    conda-forge
    aws-c-cal                 0.5.11               h95a6274_0    conda-forge
    aws-c-common              0.6.2                h7f98852_0    conda-forge
    aws-c-event-stream        0.2.7               h3541f99_13    conda-forge
    aws-c-io                  0.10.5               hfb6a706_0    conda-forge
    aws-checksums             0.1.11               ha31a3da_7    conda-forge
    aws-sdk-cpp               1.8.186              hb4091e7_3    conda-forge
    binutils_impl_linux-64    2.36.1               h193b22a_2    conda-forge
    binutils_linux-64         2.36                 hf3e587d_4    conda-forge
    bzip2                     1.0.8                h7f98852_4    conda-forge
    c-ares                    1.18.1               h7f98852_0    conda-forge
    ca-certificates           2021.10.8            ha878542_0    conda-forge
    cffi                      1.15.0          py310h0fdd8cc_0    conda-forge
    cuda-cccl                 11.6.55              hdc25635_0    nvidia
    cuda-compiler             11.6.0               hde35cc3_0    nvidia
    cuda-cudart               11.6.55              he381448_0    nvidia
    cuda-cudart-dev           11.6.55              h42ad0f4_0    nvidia
    cuda-cuobjdump            11.6.55              h9dd2d0c_0    nvidia
    cuda-cuxxfilt             11.6.55              h69de05d_0    nvidia
    cuda-driver-dev           11.6.55                       0    nvidia
    cuda-libraries-dev        11.6.0               hde35cc3_0    nvidia
    cuda-nvcc                 11.6.55              h5758ece_0    nvidia
    cuda-nvprune              11.6.55              h3791f62_0    nvidia
    cuda-nvrtc                11.6.55              hc54fff9_0    nvidia
    cuda-nvrtc-dev            11.6.55              h42ad0f4_0    nvidia
    cudatoolkit               11.6.0              habf752d_10    conda-forge
    cutensor                  1.4.0.6              h7537e88_1    conda-forge
    gcc                       11.2.0               h702ea55_4    conda-forge
    gcc_impl_linux-64         11.2.0              h82a94d6_12    conda-forge
    gcc_linux-64              11.2.0               h39a9532_4    conda-forge
    gflags                    2.2.2             he1b5a44_1004    conda-forge
    gfortran                  11.2.0               h8811e0c_4    conda-forge
    gfortran_impl_linux-64    11.2.0              h7a446d4_12    conda-forge
    gfortran_linux-64         11.2.0               h777b47f_4    conda-forge
    glog                      0.5.0                h48cff8f_0    conda-forge
    grpc-cpp                  1.42.0               ha1441d3_1    conda-forge
    gxx                       11.2.0               h702ea55_4    conda-forge
    gxx_impl_linux-64         11.2.0              h82a94d6_12    conda-forge
    kernel-headers_linux-64   3.10.0              h4a8ded7_13    conda-forge
    krb5                      1.19.2               hcc1bbae_3    conda-forge
    ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
    libblas                   3.9.0           13_linux64_openblas    conda-forge
    libbrotlicommon           1.0.9                h7f98852_6    conda-forge
    libbrotlidec              1.0.9                h7f98852_6    conda-forge
    libbrotlienc              1.0.9                h7f98852_6    conda-forge
    libcblas                  3.9.0           13_linux64_openblas    conda-forge
    libcublas                 11.8.1.74            h1e58c10_0    nvidia
    libcublas-dev             11.8.1.74            h7a51e1f_0    nvidia
    libcufft                  10.7.0.55            h563f203_0    nvidia
    libcufft-dev              10.7.0.55            h05eb8d0_0    nvidia
    libcurand                 10.2.9.55            h7c349da_0    nvidia
    libcurand-dev             10.2.9.55            hd2e71f0_0    nvidia
    libcurl                   7.81.0               h2574ce0_0    conda-forge
    libcusolver               11.3.2.107           hc875929_0    nvidia
    libcusolver-dev           11.3.2.107           h78cb71c_0    nvidia
    libcusparse               11.7.1.55            h9a152cf_0    nvidia
    libcusparse-dev           11.7.1.55            h02e612c_0    nvidia
    libedit                   3.1.20191231         he28a2e2_2    conda-forge
    libev                     4.33                 h516909a_1    conda-forge
    libevent                  2.1.10               h9b69904_4    conda-forge
    libffi                    3.4.2                h7f98852_5    conda-forge
    libgcc-devel_linux-64     11.2.0              h0952999_12    conda-forge
    libgcc-ng                 11.2.0              h1d223b6_12    conda-forge
    libgfortran-ng            11.2.0              h69a702a_11    conda-forge
    libgfortran5              11.2.0              h5c6108e_11    conda-forge
    libgomp                   11.2.0              h1d223b6_12    conda-forge
    liblapack                 3.9.0           13_linux64_openblas    conda-forge
    libnghttp2                1.46.0               h812cca2_0    conda-forge
    libnpp                    11.6.0.55            hdb0c674_0    nvidia
    libnpp-dev                11.6.0.55            h0163868_0    nvidia
    libnsl                    2.0.0                h7f98852_0    conda-forge
    libnvjpeg                 11.6.0.55            h6f17e28_0    nvidia
    libnvjpeg-dev             11.6.0.55            h0163868_0    nvidia
    libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
    libprotobuf               3.19.3               h780b84a_0    conda-forge
    libsanitizer              11.2.0              he4da1e4_12    conda-forge
    libssh2                   1.10.0               ha56f1ee_2    conda-forge
    libstdcxx-devel_linux-64  11.2.0              h0952999_12    conda-forge
    libstdcxx-ng              11.2.0              he4da1e4_12    conda-forge
    libthrift                 0.15.0               he6d91bd_1    conda-forge
    libutf8proc               2.7.0                h7f98852_0    conda-forge
    libuuid                   2.32.1            h7f98852_1000    conda-forge
    libzlib                   1.2.11            h36c2ea0_1013    conda-forge
    lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
    ncurses                   6.2                  h58526e2_4    conda-forge
    numpy                     1.22.1          py310h454958d_0    conda-forge
    openssl                   1.1.1l               h7f98852_0    conda-forge
    opt_einsum                3.3.0              pyhd8ed1ab_1    conda-forge
    orc                       1.7.2                h1be678f_0    conda-forge
    parquet-cpp               1.5.1                         2    conda-forge
    pip                       21.3.1             pyhd8ed1ab_0    conda-forge
    pyarrow                   6.0.1           py310h1a3fb3d_8_cpu    conda-forge
    pycparser                 2.21               pyhd8ed1ab_0    conda-forge
    python                    3.10.2          h62f1059_0_cpython    conda-forge
    python_abi                3.10                    2_cp310    conda-forge
    re2                       2021.11.01           h9c3ff4c_0    conda-forge
    readline                  8.1                  h46c0cb4_0    conda-forge
    s2n                       1.0.10               h9b69904_0    conda-forge
    setuptools                60.5.0          py310hff52083_0    conda-forge
    snappy                    1.1.8                he1b5a44_3    conda-forge
    sqlite                    3.37.0               h9cd32fc_0    conda-forge
    sysroot_linux-64          2.17                h4a8ded7_13    conda-forge
    tk                        8.6.11               h27826a3_1    conda-forge
    tzdata                    2021e                he74cb21_0    conda-forge
    wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
    xz                        5.2.5                h516909a_1    conda-forge
    zlib                      1.2.11            h36c2ea0_1013    conda-forge
    zstd                      1.5.2                ha95c52a_0    conda-forge
    

    @m3vaz tried a slightly different environment and observed the same issue.
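
In the backtrace for thread 10, Py_FinalizeEx calls wait_for_thread_shutdown, which in turn calls the threading module's _shutdown and blocks on a lock. In CPython that step waits for non-daemon threads to finish, so one possible reading is that some Python thread is still alive at finalization. The following is a diagnostic sketch under that assumption, not a confirmed root cause; the helper name is hypothetical. It could be called near the end of a script to list the threads that finalization would wait on.

```python
import threading


def threads_blocking_shutdown():
    """List live non-daemon threads other than the main thread.

    These are the threads that interpreter finalization waits for.
    """
    main = threading.main_thread()
    return [
        t
        for t in threading.enumerate()
        if t is not main and not t.daemon and t.is_alive()
    ]


if __name__ == "__main__":
    for t in threads_blocking_shutdown():
        print(f"still running: {t.name} (ident={t.ident})")
```

An empty list would point away from this explanation and toward the runtime's own shutdown sequence.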

    bug 
    opened by marcinz 10
  • Using a scalar in `allclose` raises an `AttributeError`

    Problem

    Using a scalar in allclose raises AttributeError: PROJ_1D_1D_.

    To reproduce

    1. Create test.py:
      from legate import numpy as lnp
      import numpy as realnp
      
      # vanilla numpy works
      a = realnp.full(10, 1e-1)
      print(realnp.allclose(a, 1e-1))
      
      # legate numpy not working
      la = lnp.full(10, 1e-1)
      print(lnp.allclose(la, 1e-1))
      
    2. Run test.py with legate --cpus 1 ./test.py -lg:numpy:test

    Output

    The first part, which uses vanilla NumPy, prints True.

    The second part, which uses Legate NumPy, raises:

    Traceback (most recent call last):
      File "<prefix>/lib/python3.8/site-packages/legion_top.py", line 408, in legion_python_main
        run_path(args[start], run_name='__main__')
      File "<prefix>/lib/python3.8/site-packages/legion_top.py", line 200, in run_path
        exec(code, module.__dict__, module.__dict__)
      File "./test.py", line 10, in <module>
        print(lnp.allclose(la, 1e-1))
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/module.py", line 459, in allclose
        return ndarray.perform_binary_reduction(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/array.py", line 2068, in perform_binary_reduction
        dst._thunk.binary_reduction(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/deferred.py", line 5167, in binary_reduction
        ) = self.runtime.compute_broadcast_transform(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/runtime.py", line 2500, in compute_broadcast_transform
        self.first_proj_id + getattr(NumPyProjCode, proj_name),
      File "<prefix>/lib/python3.8/enum.py", line 384, in __getattr__
        raise AttributeError(name) from None
    AttributeError: PROJ_1D_1D_
    

    Expected behavior

    It should work like vanilla NumPy, or raise an exception with a clear message about what is not supported.
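
The second option (a clear error message) could look roughly like the sketch below. The traceback shows the failure coming from an attribute lookup on an enum by name; here the name is resolved through a small helper that turns a missing member into a descriptive exception. DemoProjCode, its members, and lookup_proj_code are hypothetical stand-ins for illustration, not names from the Legate NumPy code base.

```python
from enum import IntEnum


class DemoProjCode(IntEnum):
    """Hypothetical stand-in for the projection-code enum in the traceback."""

    PROJ_DEMO_A = 1
    PROJ_DEMO_B = 2


def lookup_proj_code(enum_cls, proj_name):
    """Resolve proj_name on enum_cls, failing with a descriptive message."""
    try:
        # Enum classes support lookup by member name via indexing.
        return enum_cls[proj_name]
    except KeyError:
        raise NotImplementedError(
            f"no projection named {proj_name!r}; "
            "this broadcast pattern is not supported"
        ) from None
```

With such a helper, the failing call in the report would surface a NotImplementedError that names the unsupported pattern instead of a bare AttributeError: PROJ_1D_1D_.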

    opened by piyueh 10
  • Issues building cuNumeric on Mac OS 13.0.1 Ventura

    Documenting a difficult day of fighting with Apple compilers after an OS upgrade. The short version: look here if you are experiencing hangs during cuNumeric builds, specifically during the TBLIS sub-build.

    Symptoms:

    I noticed various parts of the TBLIS build would get stuck, such as after this line:

        -- cunumeric: ENV{CC}="/usr/local/opt/llvm/bin/clang"
        -- cunumeric: ENV{CXX}="/usr/local/opt/llvm/bin/clang++"
    

    or this line:

        -- Build files have been written to: /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build
        [0/56] Building tblis
        Making install in src/external/tci
        make[1]: Entering directory '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-src/src/external/tci'
          CC       tci/barrier.lo
          CC       tci/context.lo
          CC       tci/mutex.lo
          CC       tci/parallel.lo
          CC       tci/slot.lo
          CC       tci/work_item.lo
          CC       tci/communicator.lo
          CC       tci/task_set.lo
          CXXLD    lib/libtci.la
        make[2]: Entering directory '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-src/src/external/tci'
         ./install-sh -c -d '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib'
         ./install-sh -c -d '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/include'
         /bin/sh ./libtool   --mode=install /usr/bin/install -c   lib/libtci.la '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib'
         ./install-sh -c -d '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/include/tci'
        libtool: install: /usr/bin/install -c lib/.libs/libtci.0.dylib /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib/libtci.0.dylib
        libtool: install: (cd /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib && { ln -s -f libtci.0.dylib libtci.dylib || { rm -f libtci.dylib && ln -s libtci.0.dylib libtci.dylib; }; })
         /usr/bin/install -c -m 644  tci/tci_config.h tci/tci_global.h tci/barrier.h tci/communicator.h tci/context.h tci/mutex.h tci/parallel.h tci/pipeline.h tci/slot.h tci/task_set.h tci/work_item.h tci/yield.h '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/include/tci'
        libtool: install: /usr/bin/install -c lib/.libs/libtci.lai /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib/libtci.la
         /usr/bin/install -c -m 644  tci.h '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/include/.'
        libtool: install: /usr/bin/install -c lib/.libs/libtci.a /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib/libtci.a
        libtool: install: chmod 644 /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib/libtci.a
        libtool: install: ranlib /Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-build/lib/libtci.a
        make[2]: Leaving directory '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-src/src/external/tci'
        make[1]: Leaving directory '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-src/src/external/tci'
        make[1]: Entering directory '/Users/rohany/Documents/nvidia/cunumeric/_skbuild/macosx-13.0-x86_64-3.9/cmake-build/_deps/tblis-src'
    

    At this point, things started to act strangely on my laptop. Commands like gcc --version would hang. Interestingly, clang would still produce an executable, but it did not work: compiling just int main() { return 0; } generated a binary that hung before entering main. This weirdness went away once I restarted my computer. However, the build would then get stuck in the same place again.

    I lucked onto the following solution, which is really a workaround because the cuNumeric build still runs into the above problem for me.

    • First, I reinstalled the Xcode command line tools (there are several resources about how to do this).
    • After this, a build of TBLIS outside of the cuNumeric build did in fact complete (using the same configure that cuNumeric emits, followed by make -j; make install).
    • Then, I pointed cuNumeric at this build by passing --with-tblis ... to install.py.

    Even though the external TBLIS build completed, running the cuNumeric build that compiles its own TBLIS still puts my laptop back into this broken state.

    opened by rohany 2
  • NumPy and cuNumeric behave differently in 4 cases in cunumeric.prod()


    NumPy and cuNumeric behave differently for the following cases. Some of the cases need to be fixed while the others are expected divergences.

    Case-1

    import numpy as np
    import cunumeric as num

    NEGATIVE_DTYPE = ["h", "i", "H", "I", "?", "b", "B"]
    for dtype in NEGATIVE_DTYPE:
        size = (5, 5, 5)
        arr = np.random.random(size) * 10 + 2
        arr_np = np.array(arr, dtype=dtype)
        arr_num = num.array(arr_np)
        out_np = np.prod(arr_np)      # NumPy returns the product of all elements
        out_num = num.prod(arr_num)   # cuNumeric returns an array holding a different value
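
One thing worth checking when comparing results for small integer dtypes is the accumulator type. This is offered as an assumption about where such differences can come from, not as a diagnosis of this report. Plain NumPy promotes integer inputs narrower than the platform integer before multiplying, so a reduction that accumulates in the input dtype can wrap around. The sketch below uses only NumPy:

```python
import numpy as np

# Twenty int16 values of 2: the exact product is 2**20, which does not fit in int16.
arr = np.full(20, 2, dtype=np.int16)

# Default behaviour: NumPy accumulates narrow integer inputs in the platform integer.
default_prod = np.prod(arr)

# Forcing the accumulator back to int16 makes the product wrap around.
narrow_prod = np.prod(arr, dtype=np.int16)

print(default_prod.dtype, default_prod)
print(narrow_prod.dtype, narrow_prod)
```

Comparing `out_np.dtype` with `out_num.dtype` in Case-1 would show quickly whether the two libraries agree on the result type.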
    

    Case-2

    arr = (num.random.rand(5, 5) * 10 + 2) + (num.random.rand(5, 5) * 10 * 1.0j + 0.2j)
    arr_np = np.array(arr, dtype=np.complex128)
    arr_num = num.array(arr_np)
    out_np = np.prod(arr_np)
    out_num = num.prod(arr_num)
    

    The two calls return different output data.

    Case-3

    arr = [[1, 2], [3, 4]]
    initial_value = [3]

    out_num = num.prod(arr, initial=initial_value)
    out_np = np.prod(arr, initial=initial_value)
    

    cuNumeric returns array(72). NumPy raises ValueError: Input object to FillWithScalar is not a scalar
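
For comparison, here is a minimal NumPy-only sketch of one way to reject a non-scalar `initial` up front. The wrapper name `checked_prod` and its error message are hypothetical; they are not part of NumPy or cuNumeric.

```python
import numpy as np

def checked_prod(a, initial=1, **kwargs):
    """Hypothetical wrapper: reject a non-scalar `initial` before reducing."""
    if np.ndim(initial) != 0:
        raise ValueError(
            f"initial must be a scalar, got shape {np.shape(initial)}"
        )
    return np.prod(a, initial=initial, **kwargs)

arr = [[1, 2], [3, 4]]

# Scalar initial: 1 * 2 * 3 * 4 = 24, multiplied by the initial value 3.
scalar_result = checked_prod(arr, initial=3)

# List-valued initial, as in Case-3: rejected before any reduction runs.
try:
    checked_prod(arr, initial=[3])
    rejected = False
except ValueError:
    rejected = True
```

With a scalar `initial` the result is 72, the same value cuNumeric reports for the list-valued input above.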

    Case-4

    SIZE_E = [
        (1, 1),
        (1, DIM),
        (DIM, 1),
        (1, 1, 1),
        (DIM, 1, 1),
        (1, DIM, 1),
        (1, 1, DIM)
    ]
    for size in SIZE_E:
        arr_np = np.random.random(size)
        arr_num = num.array(arr_np)
        ndim = arr_np.ndim
        for axis in range(-ndim + 1, ndim, 1):
            out_np = np.prod(arr_np, axis=axis, keepdims=True)
            out_num = num.prod(arr_num, axis=axis, keepdims=True)
    

    The error is:

      # in cunumeric/deferred/unary_reduction:
      # if lhs_array.size == 1:
      #     > assert axes is None or len(axes) == rhs_array.ndim - (
      #         0 if keepdims else lhs_array.ndim
      #     )
      # E    AssertionError
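
As a reference point for Case-4, the sketch below records the output shapes that plain NumPy produces for one of the listed sizes. `DIM = 3` is an illustrative value chosen here; the report does not state which value was used.

```python
import numpy as np

DIM = 3  # illustrative value, not taken from the report
arr = np.ones((1, DIM, 1))

# Reduce over each axis in the same range the report iterates over.
shapes = {}
for axis in range(-arr.ndim + 1, arr.ndim):
    out = np.prod(arr, axis=axis, keepdims=True)
    shapes[axis] = out.shape

for axis, shape in shapes.items():
    print(axis, shape)
```

With `keepdims=True` the result keeps all three dimensions, and only the reduced axis collapses to length 1. Reducing over the `DIM` axis therefore yields an output of size 1, which is the `lhs_array.size == 1` situation in which the assertion above is evaluated.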
    
    opened by xialu00 0
  • NumPy and cuNumeric behave differently in 5 cases in test_multi_dot.py and test_dot.py


    NumPy and cuNumeric behave differently for the following cases. Some of the cases need to be fixed while the others are expected divergences.

    Case-1

        A = mk_0to1_array(num, (2, 2))
        B = mk_0to1_array(num, (2, 2))
        C = mk_0to1_array(num, (2, 2))
        arrays = [A, B, C]
        out = num.zeros((2, 1))
        num.linalg.multi_dot(self.arrays, out=out)

    In NumPy, it raises ValueError. In cuNumeric, it raises AssertionError.

    Case-2

        A = mk_0to1_array(num, (2, 2))
        B = mk_0to1_array(num, (2, 2))
        C = mk_0to1_array(num, (2, 2))
        arrays = [A, B, C]
        out = num.zeros((2, 2), dtype=np.float32)

    In NumPy, it raises ValueError. In cuNumeric, it passes.

    Case-3

        A = mk_0to1_array(num, (2, 2))
        B = mk_0to1_array(num, (2, 2))
        C = mk_0to1_array(num, (2, 2))
        arrays = [A, B, C]
        out = num.zeros((2, 2), dtype=np.int64)

    In NumPy, it raises ValueError. In cuNumeric, it raises TypeError: Unsupported type: int64

    Case-4

        A = mk_0to1_array(num, (5, 3))
        B = mk_0to1_array(num, (3, 2))
        out = np.zeros((5, 2), dtype=np.float32)
        np.dot(A, B, out=out)

    In NumPy, it raises ValueError. In cuNumeric, it passes.

    Case-5

        A = mk_0to1_array(num, (5, 3))
        B = mk_0to1_array(num, (3, 2))
        out = np.zeros((5, 2), dtype=np.int64)
        np.dot(A, B, out=out)

    In NumPy, it raises ValueError. In cuNumeric, it raises TypeError: Unsupported type: int64
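
The NumPy side of Case-4 and Case-5 can be reproduced without cuNumeric. The sketch below is NumPy-only and replaces the test helper `mk_0to1_array` with `np.linspace`, which is a stand-in chosen here rather than the helper's actual implementation.

```python
import numpy as np

# float64 inputs with the same shapes as Case-4 and Case-5.
A = np.linspace(0.0, 1.0, 15).reshape(5, 3)
B = np.linspace(0.0, 1.0, 6).reshape(3, 2)

# A matching float64 output buffer is accepted.
good_out = np.zeros((5, 2), dtype=np.float64)
np.dot(A, B, out=good_out)

# Output buffers with a different dtype are rejected by np.dot.
errors = {}
for dtype in (np.float32, np.int64):
    bad_out = np.zeros((5, 2), dtype=dtype)
    try:
        np.dot(A, B, out=bad_out)
        errors[dtype.__name__] = None
    except ValueError as exc:
        errors[dtype.__name__] = type(exc).__name__

print(errors)
```

NumPy does not cast into `out` here: it requires the buffer to already have the dtype that the product would naturally have.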

    opened by robinw0928 0
  • TestDotErrors::test_out_invalid_dtype[(dtype=numpy.int64)] triggers runtime Error


    Test command:

        +LEGATE_TEST=1 /opt/conda/envs/legate/bin/legate /legate/test/tests/integration/test_dot.py -cunumeric:test --fbmem 4096 --gpus 1 --gpu-bind 0 -v
        [PASS] (GPU) tests/integration/test_dot.py

    The test below triggers a RuntimeError. If we remove this test, the RuntimeError goes away.

        tests/integration/test_dot.py::TestDotErrors::test_out_invalid_dtype[(dtype=<class 'numpy.int64'>)] XFAIL [100%]

    Error message:

        ======================== 31 passed, 2 xfailed in 3.18s =========================
        Exception ignored in: <function RegionField.__del__ at 0x7fc08c0e01f0>
        Traceback (most recent call last):
          File "/opt/conda/envs/legate/lib/python3.9/site-packages/legate/core/store.py", line 126, in __del__
          File "/opt/conda/envs/legate/lib/python3.9/site-packages/legate/core/store.py", line 251, in detach_external_allocation
          File "/opt/conda/envs/legate/lib/python3.9/site-packages/legate/core/runtime.py", line 570, in detach_external_allocation
          File "/opt/conda/envs/legate/lib/python3.9/site-packages/legate/core/runtime.py", line 555, in _remove_allocation
          File "/opt/conda/envs/legate/lib/python3.9/site-packages/legate/core/runtime.py", line 550, in _remove_attachment
        RuntimeError: Unable to find attachment to remove

    opened by robinw0928 0
  • Fft improvements


    This PR refactors the cuFFT wrapper task in order to:

    • remove the 3D-FFT limit
    • allow for data-parallel execution where feasible
    

    NOTE: This is a new PR based on #700 which had trouble with the CI due to the branch name.

    category:improvement 
    opened by mfoerste4 0
Releases(v22.10.00)
  • v22.10.00(Oct 13, 2022)

    The biggest change in Release 22.10 is a new build infrastructure using CMake and scikit-build. The new build system brings several benefits including robust build dependency tracking and compliance with Python site-packages. This release includes several new search and indexing operators, fixes for several performance and correctness bugs, and provenance tracking for top-level and ndarray routines in execution profiles.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    What's Changed

    🚀 New Features

    • Argwhere and flatnonzero by @mfoerste4 in #525

    • added extract and place via advanced indexing by @mfoerste4 in #536
    • Fill diagonal by @ipdemes in #473
    • Single processor implementation for linalg.solve by @magnatelee in #568

    🛠️ Improvements

    • adding support for array shape () passed as an index argument in advanced indexing by @ipdemes in #486
    • Refactor test driver for cpu/gpu sharding by @bryevdv in #451
    • Collate test output to allow workers > 1 with verbose output by @bryevdv in #507
    • Ensure test.py --use flag fully overrides USE_* envvars by @manopapad in #524
    • Enhance two integration tests by @robinw0928 in #511
    • Add typing to array.py by @bryevdv in #478
    • Update test runner for osx by @bryevdv in #529
    • Don't blindly trust user-supplied bincount.minlength by @manopapad in #523
    • Make reduced-precision cuBLAS mode opt-in by @manopapad in #519
    • Fix reciprocal tests for zero values and improve test value customization (#467) by @marcinz in #537
    • Refactor test runner to support more pinning options by @bryevdv in #535
    • Remove dead code in bincount by @magnatelee in #546
    • Make the validation condition for random distributions lenient by @magnatelee in #550
    • src/cunumeric: handle high number of bins in GPU bincount by @rohany in #526
    • Construct NumPy arrays correctly from 0D deferred arrays backed by region fields by @magnatelee in #551
    • Collect test failure details at the end by @bryevdv in #556
    • Simplify some thunk conversion helpers by @manopapad in #553
    • Fix a compiler warning by @magnatelee in #555
    • Add option to disable CPU pinning in tests by @bryevdv in #558
    • Use the new mapper registration to enable detailed mapper logging by @magnatelee in #570
    • src/cunumeric/search: make nonzero not always allocate SYS_MEM buffers by @rohany in #572
    • add negative test case in test_array_split.py by @xialu00 in #545
    • add some test cases for test_arg_reduce.py by @xialu00 in #575
    • Testcase-add test cases for test_flip and test_indices by @xialu00 in #579
    • Refactor scalar reductions to use common execution policy by @jjwilke in #573
    • Sanitize k for the eye operator by @magnatelee in #586
    • Add CMake build for C++ and scikit-build infrastructure for Python package installation by @jjwilke in #514
    • Enhance test_block.py and test_eye.py by @robinw0928 in #578
    • Testcase add test cases for test_fill.py and test_ndim.py by @xialu00 in #588
    • Remove run dependency on curand by @marcinz in #520
    • Use Legion Fills when possible by @manopapad in #604
    • Support building with GASNet-Ex and MPI backends by @manopapad in #610
    • Provenance tracking for cuNumeric operators by @magnatelee in #596
    • Fix tests utils to make --directory work correctly. by @robinw0928 in #592
    • Fix a compiler warning by @magnatelee in #594
    • Enhance test_diag_indices.py and test_flatten.py. by @robinw0928 in #609
    • cuNumeric doesn't need nested provenance tracking by @magnatelee in #617
    • Add RuntimeError exception to legate.time by @robinw0928 in #618
    • Stop instantiating min and max reduction ops for complex types by @magnatelee in #621
    • Mark temporary conversion outputs as linear for eager storage recycling by @magnatelee in #608
    • Make the negative test on fill robust across Python versions by @magnatelee in #619
    • Enhance mask_indices and move_axis by @robinw0928 in #622
    • src/cunumeric/matrix: stop including coll.h in solve_template.inl by @rohany in #620

    🐛 Bug Fixes

    • Fix performance bugs in scalar reductions by @magnatelee in #509
    • Don't use internal LAPACK function names by @manopapad in #522
    • Bug fixes for advanced indexing by @magnatelee in #532
    • Handle the case where LAPACK_*potrf is a macro, not a function by @manopapad in #527
    • fix mypy issue w/ np methods by @bryevdv in #542
    • Fix buggy complex-to-bool conversions and add correctness tests for astype by @magnatelee in #549
    • fixing advanced indexing operation for empty arrays by @ipdemes in #504
    • Do not link curand by @marcinz in #541
    • Fixing issues with advanced_indexing_kernel by @ipdemes in #557
    • fixing another corner case for advanced indexing by @ipdemes in #554
    • Fix OSX test shard generation by @bryevdv in #563
    • fix error print in test_unary_ufunc by @jjwilke in #566
    • Add NAN handling to convert() needed for some prefix routines with integer outputs. by @rkarim2 in #502
    • Fixing logic for slicing by @ipdemes in #574
    • Fix linalg.solve when inputs are scalars by @magnatelee in #585
    • Allow casting in cn.dot, to match numpy's behavior by @manopapad in #598
    • Add linalg.solve to the cmake build by @magnatelee in #603
    • Invoke eye with read-write privilege, not write-discard by @manopapad in #616
    • Fix a bug in scalar reduction launching kernels with empty domains by @magnatelee in #606

    📖 Documentation

    • Added note to prefix documentation for corner cases where cunumeric results can diverge from numpy by @rkarim2 in #528
    • updating documentation by @ipdemes in #614
    • Add missing docs symlink by @bryevdv in #635
  • v22.08.00(Aug 9, 2022)

    Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    New Features

    • Adding support for ND output regions in Advanced Indexing task by @ipdemes in #370
    • added support for 'searchsorted' by @mfoerste4 in #414
    • np.packbits and np.unpackbits by @magnatelee in #427
    • Implementation of atleast_{1,2,3}d by @sbak5 in #404
    • Implementing cunumeric.random.BitGenerator by @fduguet-nv in #254
    • Adding support for some simple _indices routines by @ipdemes in #417
    • adding mask_indices routine by @ipdemes in #426
    • Random advanced distributions by @fduguet-nv in #470
    • Distributed nd sort for cpu/omp by @mfoerste4 in #437
    • Initial implementation of scan routines. by @rkarim2 in #425
    • Adding support for take_along_axis and put_along_axis by @ipdemes in #436
    • cunumeric.ndim by @magnatelee in #495
    • Add support for curand conda package build (cherry pick #510) by @marcinz in #512

    Improvements

    • Don't run the resolution logic if the arrays have the same dtype by @magnatelee in #389
    • Set cuda virtual package as hard run requirement for gpu conda package by @m3vaz in #398
    • First pass mypy typing by @bryevdv in #387
    • Generalize Dict to Mapping for newer versions of mypy by @jjwilke in #405
    • Add support for using cupy in sort.py by @robinw0928 in #395
    • Refactor test.py by @bryevdv in #378
    • Use Numpy axis normalizations where possible by @bryevdv in #419
    • More mypy by @bryevdv in #413
    • adding bounds check for advanced indexing by @ipdemes in #397
    • Report Elapsed Time in cholesky's output by @SeyedMir in #423
    • Support -vv for more verbose test output by @bryevdv in #432
    • Add typing to runtime.py by @bryevdv in #428
    • Update compress/take tests for pytest by @bryevdv in #435
    • Project down to a 1D store for the scalar reduction output by @magnatelee in #455
    • Fallback to self = np.ndarray when necessary by @bryevdv in #431
    • Add types to thunk modules by @bryevdv in #438
    • allclose detail + misc tests improvements by @bryevdv in #457
    • cunumeric.random - Adding Module-scoped functions by @fduguet-nv in #481
    • Activate the NumPy fallback for cunumeric.random in CPU build by @magnatelee in #485
    • Legacy generators for cpu build by @magnatelee in #487
    • Allow CPU build to optionally use cuRAND by @magnatelee in #498
    • Sanitize shapes in ndarray's constructor by @magnatelee in #496
    • src/cunumeric/sort: stop using std::{inclusive, exclusive}_scan by @rohany in #499
    • Update conda requirements by @manopapad in #383
    • Handle dtype/casting/out properly in contractions by @manopapad in #402
    • Missing / overzealous check_eager_args calls by @manopapad in #465
    • Strengthen some types by @manopapad in #468

    Bug Fixes

    • Add missing includes to aid intellisense providers by @trxcllnt in #382
    • Proper exception handling for cholesky by @magnatelee in #391
    • Fixes for building with setup.py outside conda, primarily Mac by @jjwilke in #394
    • Use the right API to check if the store is unbound by @magnatelee in #399
    • Fix nargs for report:dump-csv by @bryevdv in #400
    • Handle empty outputs correctly in advanced indexing task by @magnatelee in #396
    • Fall back to NumPy in __array_function__ and __array_ufunc__ by @magnatelee in #424
    • Fix for legate data interface by @magnatelee in #429
    • Fix test_floating.py test to call sys.exit by @marcinz in #433
    • Make missing pynvml an error for GPU tests by @bryevdv in #441
    • Make the NumPy fallback work correctly in randint by @magnatelee in #450
    • Squeeze fix by @magnatelee in #448
    • Correctly prune out empty tasks in binary reduction by @magnatelee in #453
    • Minor fix for indexing routines by @magnatelee in #452
    • Make DeferredArray.reshape always return a deferred array by @magnatelee in #454
    • Re-freezing conda compiler versions (#415) by @m3vaz in #449
    • Fix for floating point predicates by @magnatelee in #466
    • markdown version fix by @ipdemes in #459
    • Fixup typing regressions by @bryevdv in #471
    • Remove ill-defined advanced indexing test case by @magnatelee in #484
    • Handle empty inputs correctly in local scan tasks by @magnatelee in #491
    • Handle an unknown in a tuple correctly in reshape by @magnatelee in #490
    • fix mismatched size_t/uint64_t types by @jjwilke in #475
    • Allow scalar cunumeric ndarrays as array indices by @manopapad in #479

    Documentation

    • adding new version for documentations by @ipdemes in #447
    • Updates to api_compare.py by @bryevdv in #456
    • Be stricter applying CuWrapperMetadata by @bryevdv in #463
    • Add custom nitpicky ref checks for cunumeric APIs by @bryevdv in #462
    • Docs coverage check by @bryevdv in #469
    • Fix the API reference for random functions and scan operators by @magnatelee in #497

    New Contributors

    • @jjwilke made their first contribution in https://github.com/nv-legate/cunumeric/pull/394
    • @SeyedMir made their first contribution in https://github.com/nv-legate/cunumeric/pull/423
    • @fduguet-nv made their first contribution in https://github.com/nv-legate/cunumeric/pull/254
    • @rkarim2 made their first contribution in https://github.com/nv-legate/cunumeric/pull/425
    • @rohany made their first contribution in https://github.com/nv-legate/cunumeric/pull/499

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.02...v22.08.00

  • v22.05.02(Jun 21, 2022)

    This hotfix release fixes issues in conda recipes.

    What's Changed

    • Cherry pick: Update conda requirements (#383) by @marcinz in https://github.com/nv-legate/cunumeric/pull/406
    • Cherry pick: Set cuda virtual package as hard run requirement for conda gpu package (#398) by @marcinz in https://github.com/nv-legate/cunumeric/pull/407
    • Cherry pick: Fix nargs for report:dump-csv (#400) by @marcinz in https://github.com/nv-legate/cunumeric/pull/408
    • Re-freezing conda compiler versions by @m3vaz in https://github.com/nv-legate/cunumeric/pull/415

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.01...v22.05.02

  • v22.05.01(Jun 16, 2022)

    This hotfix release updates the conda build recipe to make the cuNumeric package depend on the right version of NumPy and also fixes a bug in the command-line argument parser.

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.00...v22.05.01

  • v22.05.00(Jun 7, 2022)

    Release 22.05 features complete support for advanced indexing and related indexing routines (compress and take), a multi-node multi-GPU sorting implementation for multi-dimensional ndarrays, window functions, several matrix/tensor operations (trace, matrix_power, multi_dot, and einsum_path) and primitive support for FFT on a single GPU using cuFFT.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    New Features

    • thrust allocator for sort by @mfoerste4 in #228
    • implementation of np.block w/ a test by @sbak5 in #213
    • Window functions by @magnatelee in #283
    • Advanced indexing by @ipdemes in #235
    • First implementation of single-GPU FFT using cuFFT by @mferreravila in #238
    • Use the stream pool in Legate core by @magnatelee in #295
    • Add partition api and utilize sort backend by @mfoerste4 in #287
    • implementing TRACE operation by @ipdemes in #263
    • adding support for negative indices in advanced indexing by @ipdemes in #322
    • Add cpu-only packages to the conda variants by @m3vaz in #330
    • Bump minpy to 3.8 (conda env and recipe) by @bryevdv in #332
    • Remaining ufuncs by @magnatelee in #315
    • Logic functions by @magnatelee in #347
    • Slicing-based np.block implementation by @sbak5 in #306
    • Implement matrix_power by @manopapad in #360
    • Distributed N-dimensional sort by @mfoerste4 in #316
    • Implement einsum_path by @manopapad in #361
    • adding diag_indices and diag_indices_from routines by @ipdemes in #367
    • Implement moveaxis by @manopapad in #364
    • Implement __array_function__ and __array_ufunc__ by @manopapad in #353
    • Implement more norm cases by @manopapad in #366
    • Implement multi_dot by @manopapad in #358
    • Adding support for "indices" routine by @ipdemes in #368
    • Support axis=None and keepdims=True/False in argmin and argmax by @trxcllnt in #346

    Improvements

    • Move the ufunc module (ported to branch-22.05) by @magnatelee in #242
    • Use ufuncs in special methods by @magnatelee in #247
    • Initial unit tests by @bryevdv in #229
    • Revise type coercion by @magnatelee in #264
    • adding 'only' option to the tests.py by @ipdemes in #248
    • Updates for using the new unbound store API by @magnatelee in #265
    • Don't run the resolution logic if the arrays have the same dtype (ported to 22.05) by @magnatelee in #390
    • Use find_packages for installation by @magnatelee in #269
    • Some misc tests and types by @bryevdv in #268
    • Forward-port #257 by @manopapad in #273
    • Split up sort.cu for parallel compilation by @magnatelee in #277
    • Debugging checks by @magnatelee in #281
    • Update example programs by @magnatelee in #289
    • Bump up NumPy version by @magnatelee in #291
    • Don't use constexpr for window functions by @magnatelee in #294
    • Better error message on unsupported complex reductions by @manopapad in #300
    • handle coverage wrapping uniformly including ufuncs by @bryevdv in #272
    • Architecture-agnostic check for int128 by @manopapad in #293
    • Unit test fixups by @bryevdv in #303
    • reduce testcases for partition test by @mfoerste4 in #304
    • Adding conda build recipe files by @marcinz in #274
    • Use pytest for test running by @bryevdv in #297
    • Add unit tests to test.py by @marcinz in #305
    • Change _cunumeric_implemented into a dataclass by @manopapad in #318
    • Pass reporting explicitly to coverage decorators by @bryevdv in #333
    • FFT refactoring by @magnatelee in #310
    • Declare ufunc formatter to be safe for parallel read by @magnatelee in #335
    • Force installation of Lapack in OpenBLAS build by @marcinz in #266
    • Mark no out-of-range indices for copies by @magnatelee in #336
    • Discussion PR for conda envs split by @bryevdv in #326
    • Use 64-bit integers for global thread ids by @magnatelee in #349
    • Use legate.core arg parsing by @bryevdv in #343
    • adding compress and take operations by @ipdemes in #296
    • Conda recipes improvements by @marcinz in #345
    • Misc small updates by @bryevdv in #352
    • adding performance tests for indexing routines by @ipdemes in #337
    • Add support for using cupy by @robinw0928 in #373

    Bug Fixes

    • Forward port late commits from 22.03 by @bryevdv in #241
    • Catch up the ufunc renaming (ported to 22.05) by @magnatelee in #244
    • Activate the cuBLAS workaround by checking the cuBLAS version at runtime (ported to 22.05) by @magnatelee in #246
    • fix large shape >int32 by @mfoerste4 in #236
    • Fix a compile error by @magnatelee in #251
    • Fix the out-of-bounds bug in reshape by @magnatelee in #267
    • add missing comparison functions by @bryevdv in #278
    • Fix nonzero by @magnatelee in #285
    • fix return value of ndarray.argsort by @mfoerste4 in #286
    • Fix typos in tests after pytest transition by @manopapad in #309
    • Update trace.py tests / fix some warnings by @bryevdv in #307
    • Don't dump test stdout unconditionally by @bryevdv in #314
    • Add typing_extensions requirement to conda recipe by @marcinz in #325
    • Fix pytest exit to fail on errors by @marcinz in #334
    • Fixing #321 issue by @ipdemes in #341
    • Missing arguments in cases of eager-to-deferred fallback by @manopapad in #348
    • Add a missing instance of share=True by @manopapad in #350
    • Fix return types for some of the unary ops by @magnatelee in #354
    • fixing compile-time warnings by @ipdemes in #351
    • Remove special case handling for scalar arrays by @manopapad in #357
    • Fix the bug in np.append test on empty input array and non-empty scalars by @sbak5 in #365
    • Match NumPy's behavior for isclose(inf,inf) by @manopapad in #372
    • Fix unary reductions by @magnatelee in #369
    • Allow DeferredThunks to be created for empty arrays by @manopapad in #371
    • Fix documentation building by @manopapad in #377
    • Make the example programs pass the CI by @magnatelee in #380

    Documentation

    • Comparison table update by @ipdemes in #252
    • Add user-facing docs for coverage reporting by @bryevdv in #261
    • creating script for calculating API coverage by categories by @ipdemes in #271
    • Doc update by @magnatelee in #275
    • Fix docs builds for trace by @manopapad in #308
    • fixing documentation for fft by @ipdemes in #302
    • Add a custom autodoc class for ufuncs by @bryevdv in #317
    • Refactor comparison table as Sphinx extension by @bryevdv in #323
    • lgpatch docs + doc fixups by @bryevdv in #356

    New Contributors

    • @mferreravila made their first contribution in https://github.com/nv-legate/cunumeric/pull/238
    • @m3vaz made their first contribution in https://github.com/nv-legate/cunumeric/pull/330
    • @robinw0928 made their first contribution in https://github.com/nv-legate/cunumeric/pull/373

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.03.00...v22.05.00

  • v22.03.00(Apr 5, 2022)

    Release 22.03 adds several new features, including np.repeat, np.unique, np.inner, np.outer, and 35 new universal functions (ufuncs). In this release, we have also significantly revised and refactored tensor operations to make them comprehensive. Preliminary support for 1D array sorting for multi-GPU execution is available. (CPU and OpenMP paths are still single processor only.) We have also made performance improvements for np.convolve and np.tril/triu for GPU execution. Finally, we have added a tool that reports cuNumeric’s API coverage for a given NumPy program execution. (For usage, please refer to “Measuring API coverage” in the cuNumeric documentation.)

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    New Features

    • Sort pr by @mfoerste4 in #199
    • Add basic cunumeric.patch module by @bryevdv in #225
    • adding support for REPEAT operation by @ipdemes in #190
    • np.unique implementation by @magnatelee in #192
    • np.append & ndarray.flatten by @sbak5 in #196
    • General cuFFT plan cache by @magnatelee in #195
    • Tools for checking API coverage by @magnatelee in #191
    • Overhaul linear algebra operations by @manopapad in #217

    Improvements

    • Move the ufunc module by @magnatelee in #234
    • ufunc refactoring + a bunch of missing ufuncs by @magnatelee in #223
    • Expand coverage reporting to ndarray methods by @bryevdv in #219
    • Einsum benchmark improvements by @manopapad in #222
    • Remove old-style casts by @manopapad in #218
    • Optimize np.tril used in Cholesky by @magnatelee in #214
    • Add a convergence threshold argument to the cg example by @marcinz in #221
    • Make sure nonzero produces outputs in C order by @magnatelee in #216
    • API cleanup for ndarray by @bryevdv in #209
    • Minor improvement for diag by @magnatelee in #211
    • Stop using alloca by @magnatelee in #212
    • Port and refactor GH #140 "Use cufft callbacks for better performance on fft-based convolutions" by @magnatelee in #204

    Bug Fixes

    • Activate the cuBLAS workaround by checking the cuBLAS version at runtime by @magnatelee in #245
    • Catch up the ufunc renaming by @magnatelee in #243
    • Fix coverage for ufuncs by @bryevdv in #240
    • Fix docs breakage by @bryevdv in #239
    • Fix compilation errors on clang by @manopapad in #233
    • Add cunumeric.ufunc to packages by @bryevdv in #231
    • Fix trailing comma tuple bug by @bryevdv in #230
    • Fix the build issue with Thrust by @magnatelee in #227
    • Fix some docs breakage by @bryevdv in #224
    • Fix for #208 by @magnatelee in #210
    • Fix for #206 by @magnatelee in #207
    • Fixed bugs for 1D array inputs on vstack, dstack and column_stack by @sbak5 in #182

    Documentation

    • Add docstrings to ndarray methods by @bryevdv in #205
    • Clean up Sphinx warnings by @bryevdv in #202
    • adding versions to the documentation by @ipdemes in #198
    • adding script for comparing API coverage + table at the documentation page by @ipdemes in #193
    • User facing documentation for API usage tool by @bryevdv in https://github.com/nv-legate/cunumeric/pull/262

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.01.00...v22.03.00

  • v22.01.00(Feb 10, 2022)

    Release 22.01 adds support for einsum expressions, logic functions and a subset of indexing and array manipulation routines.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    New Features

    • Convolution by @magnatelee and @lightsighter in #103
    • Added a few universal functions and logical operations by @ipdemes in #134
    • numpy.tril and numpy.triu by @magnatelee in #144
    • Einsum operation by @manopapad in #142
    • Cholesky factorization by @magnatelee in #160
    • Implemented split routines and a test by @sbak5 in #152
    • Choose operation by @ipdemes in #146

    Improvements

    • Convolve Cache for cuFFT by @lightsighter in #109
    • Warmup iterations for Richardson-Lucy by @magnatelee in #113
    • Remove NumPyAllocation by @magnatelee in #118
    • Update for new data ingest interface by @manopapad in #105
    • Enable some temporarily commented-out tests by @manopapad in #119
    • Testcase for legate.core!94 by @manopapad in #120
    • Use built-in reduction op by @magnatelee in #136
    • Managing CUDA library contexts directly in cuNumeric by @magnatelee in #138
    • Support for cuSOLVER by @magnatelee in #139
    • Make CUDA library context cache thread safe by @magnatelee in #141
    • Use .cu for CUDA library management by @magnatelee in #145
    • Some reusable test input generators by @manopapad in #153
    • Fix Wundefined-var-template clang warning by @manopapad in #154
    • Add eager fallback mode to testing script by @manopapad in #156
    • Add eager tests by @marcinz in #157
    • Small additions to test input generators by @manopapad in #159
    • No longer need to reserve one dim for reductions by @manopapad in #161
    • Use a per-device stream cache for CUDA library calls by @magnatelee in #165
    • Simple tiling heuristic for Cholesky factorization by @magnatelee in #167
    • Fix clang-format config to include cu,cuh,inl files by @manopapad in #168
    • LEGATE_ABORT is now a statement by @magnatelee in #169
    • Preloading CUDA libraries by @magnatelee in #171
    • Use CHECK_* macros in a couple more places by @manopapad in #172
    • Fix some invocations of complex constructors by @manopapad in #173
    • Add a switch to not call tril on Cholesky outputs by @magnatelee in #174
    • Do python install on custom dir w/o eggs by @manopapad in #177
    • Refined 'tests/array_split.py' w/ more essential input shapes by @sbak5 in #178
    • WIP: adding logic for DIAGONAL by @ipdemes in #170
    • Stack and concatenate routines including subroutines by @sbak5 in #175
    • Refactoring by @magnatelee in #181

    Bug Fixes

    • Fix #111 by @magnatelee in #115
    • math.prod not available in python 3.7 by @manopapad in #129
    • Fix some compiler warnings by @magnatelee in #130
    • dot: fix error message on unsupported array dimensions by @manopapad in #133
    • Fix slot calculation in reduction kernel by @manopapad in #148
    • Port fix for #79 by @manopapad in #155
    • Build OpenBLAS with CROSS option to prevent tests at compile time by @marcinz in #158
    • Pin setuptools version, to work around breaking change by @manopapad in #164
    • Workaround for a bug in cuBLAS < 11.4 by @magnatelee in #185
    • Cannot install cuNumeric to different dir than Legate Core by @manopapad in #186
    • Adjust error tolerance for float16, to avoid spurious test failure by @manopapad in #166

    Documentation

    • Adding contributions file by @marcinz in #147
    • Update docstrings by @magnatelee in #188

    New Contributors

    • @lightsighter made their first contribution in https://github.com/nv-legate/cunumeric/pull/109
    • @ipdemes made their first contribution in https://github.com/nv-legate/cunumeric/pull/134
    • @pre-commit-ci made their first contribution in https://github.com/nv-legate/cunumeric/pull/151
    • @sbak5 made their first contribution in https://github.com/nv-legate/cunumeric/pull/152

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v21.11.00...v22.01.00

  • v21.11.00(Nov 9, 2021)

    This is the initial public alpha release of cuNumeric, an aspiring drop-in replacement for NumPy at scale.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    What's Changed

    • Refactoring for the broadcasting logic by @magnatelee in https://github.com/nv-legate/cunumeric/pull/18
    • Improved partitioning and sharding for GEMV by @manopapad in https://github.com/nv-legate/cunumeric/pull/37
    • Fix #16 by @manopapad in https://github.com/nv-legate/cunumeric/pull/38
    • Add CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/43
    • Use a script on the runner to checkout CI repository by @marcinz in https://github.com/nv-legate/cunumeric/pull/44
    • Fix CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/45
    • Extend tests with CPU/GPU/OMP testing by @marcinz in https://github.com/nv-legate/cunumeric/pull/48
    • Remove accidental part of the job matrix from CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/49
    • Add missing alignment constraints for matrix-vector multiplication by @magnatelee in https://github.com/nv-legate/cunumeric/pull/58
    • Force left alignment for pointers and references by @magnatelee in https://github.com/nv-legate/cunumeric/pull/59
    • Don't alter the GC priority for external instances by @magnatelee in https://github.com/nv-legate/cunumeric/pull/60
    • Be strict when importing legate.numpy in examples by @manopapad in https://github.com/nv-legate/cunumeric/pull/61
    • Fix for reinterpret casts that are actually unsafe in modern C++ by @magnatelee in https://github.com/nv-legate/cunumeric/pull/62
    • Remove the return type of the void-returning function in the mapper by @magnatelee in https://github.com/nv-legate/cunumeric/pull/63
    • Remove dependency on numpy>=1.20 by @manopapad in https://github.com/nv-legate/cunumeric/pull/64
    • Stop using looping templates by @magnatelee in https://github.com/nv-legate/cunumeric/pull/65
    • Bug fix for release mode by @magnatelee in https://github.com/nv-legate/cunumeric/pull/66
    • Port nonzero to the new buffer API by @magnatelee in https://github.com/nv-legate/cunumeric/pull/68
    • Missing constraint for bincount by @magnatelee in https://github.com/nv-legate/cunumeric/pull/69
    • Clean up install script by @manopapad in https://github.com/nv-legate/cunumeric/pull/70
    • Fixes to compile on MacOS by @manopapad in https://github.com/nv-legate/cunumeric/pull/71
    • Disable absolute and allclose for complex types only with Clang by @magnatelee in https://github.com/nv-legate/cunumeric/pull/72
    • Generalize the reshape operator by @magnatelee in https://github.com/nv-legate/cunumeric/pull/73
    • Improve dot product for half precision floats by @magnatelee in https://github.com/nv-legate/cunumeric/pull/74
    • Support for tensordot by @magnatelee in https://github.com/nv-legate/cunumeric/pull/75
    • Bugfixes on operations by @manopapad in https://github.com/nv-legate/cunumeric/pull/76
    • Add missing type casts for __half by @magnatelee in https://github.com/nv-legate/cunumeric/pull/77
    • Pull the correct Core image by @marcinz in https://github.com/nv-legate/cunumeric/pull/78
    • Port remaining fixes from old branch by @manopapad in https://github.com/nv-legate/cunumeric/pull/80
    • Remove remaining conditional legate.numpy imports from examples by @manopapad in https://github.com/nv-legate/cunumeric/pull/81
    • Always dump test output by @marcinz in https://github.com/nv-legate/cunumeric/pull/83
    • Minor code cleanups by @manopapad in https://github.com/nv-legate/cunumeric/pull/85
    • Attempt to address #84 by @manopapad in https://github.com/nv-legate/cunumeric/pull/86
    • Always follow the core's choice regarding CUDA/OpenMP support by @manopapad in https://github.com/nv-legate/cunumeric/pull/88
    • Fix legate data interface by @magnatelee in https://github.com/nv-legate/cunumeric/pull/92
    • Handle overlapping stores correctly in dot by @magnatelee in https://github.com/nv-legate/cunumeric/pull/93
    • Improvements to handling of scalar arrays by @manopapad in https://github.com/nv-legate/cunumeric/pull/90
    • Port to the new calling convention by @magnatelee in https://github.com/nv-legate/cunumeric/pull/89
    • Prevent CI on forks by @marcinz in https://github.com/nv-legate/cunumeric/pull/94
    • Emptiness checks for matrix ops by @magnatelee in https://github.com/nv-legate/cunumeric/pull/95
    • Mapper update by @magnatelee in https://github.com/nv-legate/cunumeric/pull/82
    • Port to the new reduction op interface by @magnatelee in https://github.com/nv-legate/cunumeric/pull/96
    • Stop using delinearization by @magnatelee in https://github.com/nv-legate/cunumeric/pull/97
    • Dead code elimination by @magnatelee in https://github.com/nv-legate/cunumeric/pull/98
    • Reorganizing source files by @magnatelee in https://github.com/nv-legate/cunumeric/pull/99
    • Remove leftover requirements.txt by @manopapad in https://github.com/nv-legate/cunumeric/pull/100
    • Update for build system changes by @manopapad in https://github.com/nv-legate/cunumeric/pull/101
    • Updates for new attachment interface by @manopapad in https://github.com/nv-legate/cunumeric/pull/102
    • Fix for matrix-vector multiplication by @magnatelee in https://github.com/nv-legate/cunumeric/pull/104
    • Another attempt to fix degenerate cases by @magnatelee in https://github.com/nv-legate/cunumeric/pull/107
    • Fix #111 by @magnatelee in https://github.com/nv-legate/cunumeric/pull/116
    • Release 21.11.00 by @marcinz in https://github.com/nv-legate/cunumeric/pull/121

    New Contributors

    • @marcinz made their first contribution in https://github.com/nv-legate/cunumeric/pull/43

    Full Changelog: https://github.com/nv-legate/cunumeric/commits/v21.11.00

Owner
Legate
High Productivity High Performance Computing