Massively parallel self-organizing maps: accelerate training on multicore CPUs, GPUs, and clusters

Overview

Somoclu

Somoclu is a massively parallel implementation of self-organizing maps. It exploits multicore CPUs, can rely on MPI to distribute the workload across a cluster, and can be accelerated by CUDA. A sparse kernel is also included, which is useful for training maps on the vector spaces generated in text mining processes.

Key features:

  • Fast execution by parallelization: OpenMP, MPI, and CUDA are supported.
  • Multi-platform: Linux, macOS, and Windows are supported.
  • Planar and toroid maps.
  • Rectangular and hexagonal grids.
  • Gaussian and bubble neighborhood functions.
  • Both dense and sparse input data are supported.
  • Large maps of several hundred thousand neurons are feasible.
  • Integration with Databionic ESOM Tools.
  • Python, R, Julia, and MATLAB interfaces for the dense CPU and GPU kernels.

For more information, refer to the manuscript about the library [1].

Usage

Basic Command Line Use

Somoclu takes a plain text input file -- either dense or sparse data. Example files are included.

$ [mpirun -np NPROC] somoclu [OPTIONS] INPUT_FILE OUTPUT_PREFIX

Arguments:

-c FILENAME              Specify an initial codebook for the map.
-d NUMBER                Coefficient in the Gaussian neighborhood function
                         exp(-||x-y||^2/(2*(coeff*radius)^2)) (default: 0.5)
-e NUMBER                Maximum number of epochs
-g TYPE                  Grid type: square or hexagonal (default: square)
-h, --help               This help text
-k NUMBER                Kernel type
                            0: Dense CPU
                            1: Dense GPU
                            2: Sparse CPU
-l NUMBER                Starting learning rate (default: 0.1)
-L NUMBER                Finishing learning rate (default: 0.01)
-m TYPE                  Map type: planar or toroid (default: planar)
-n FUNCTION              Neighborhood function (bubble or gaussian, default: gaussian)
-p NUMBER                Compact support for Gaussian neighborhood
                         (0: false, 1: true, default: 0)
-r NUMBER                Start radius (default: half of the map in direction min(x,y))
-R NUMBER                End radius (default: 1)
-s NUMBER                Save interim files (default: 0):
                            0: Do not save interim files
                            1: Save U-matrix only
                            2: Also save codebook and best matching
-t STRATEGY              Radius cooling strategy: linear or exponential (default: linear)
-T STRATEGY              Learning rate cooling strategy: linear or exponential (default: linear)
-v NUMBER                Verbosity level, 0-2 (default: 0)
-x, --columns NUMBER     Number of columns in map (size of SOM in direction x)
-y, --rows    NUMBER     Number of rows in map (size of SOM in direction y)

Examples:

$ somoclu data/rgbs.txt data/rgbs
$ mpirun -np 4 somoclu -k 0 --rows 20 --columns 20 data/rgbs.txt data/rgbs

With random initialization, the initial codebook will be filled with random numbers ranging from 0 to 1. Either supply your own initial codebook or normalize your data to fall in this range.
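
For example, a minimal min-max scaling step in Python could look like this (a sketch assuming NumPy and a plain-text data file with one instance per row; the file names are illustrative):

    import numpy as np

    data = np.loadtxt("mydata.txt")          # one data instance per row
    mins = data.min(axis=0)
    spans = data.max(axis=0) - mins
    spans[spans == 0] = 1.0                  # guard against constant features
    np.savetxt("mydata_scaled.txt", (data - mins) / spans)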

If the feature values include negative numbers, the codebook will eventually adjust. It is, however, not advisable to have negative values, especially if the codebook is initialized with values between 0 and 1. This is a consequence of the batch training used by the parallel implementation: the batch update rule replaces each codebook vector with a weighted average of the data points, and with mixed-sign values, the updates can cancel out.
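
To see why cancellation can occur, consider a bare-bones sketch of one batch update step (plain NumPy, purely illustrative -- this is not Somoclu's kernel code, and a one-dimensional grid is used for brevity):

    import numpy as np

    def batch_update(codebook, data, bmus, radius, coeff=0.5):
        # codebook: (n_neurons, dim); data: (n_points, dim);
        # bmus: (n_points,) index of the best matching unit per data point.
        n_neurons = codebook.shape[0]
        grid = np.arange(n_neurons, dtype=float)    # 1-D grid for brevity
        # Gaussian neighborhood exp(-d^2 / (2*(coeff*radius)^2)),
        # matching the formula given for the -d option above.
        d = grid[:, None] - grid[bmus][None, :]     # (n_neurons, n_points)
        h = np.exp(-d ** 2 / (2.0 * (coeff * radius) ** 2))
        # Each neuron becomes a weighted average of the data points;
        # with mixed-sign data the numerator terms can cancel out.
        return (h @ data) / h.sum(axis=1, keepdims=True)

    # e.g.: new_codebook = batch_update(codebook, data, bmus, radius=10)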

The maps generated by the GPU and the CPU kernels are likely to differ. For computational efficiency, Somoclu uses single-precision floats, which occasionally results in identical distances between a data instance and several neurons. The CPU version picks the best matching unit with the lowest coordinate values; such sequentiality cannot be guaranteed in the reduction kernel of the GPU variant. This is not a bug, but it is good to be aware of it.
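
The CPU behaviour corresponds to how NumPy's argmin resolves ties, always returning the lowest index (a purely illustrative snippet):

    import numpy as np

    distances = np.array([0.5, 0.25, 0.25, 0.7], dtype=np.float32)
    print(np.argmin(distances))  # 1: the tie between units 1 and 2 resolves
                                 # to the lower index; a GPU reduction may not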

Efficient Parallel and Distributed Execution

The CPU kernels use OpenMP to keep all cores of a multicore processor busy. On a single node, this is more efficient than launching one MPI task per core: each MPI task replicates the codebook, which is especially wasteful for large maps.

For instance, on a single node with eight cores, the following execution will use an eighth of the memory of the MPI variant below and will run 10-20% faster:

$ somoclu -x 200 -y 200 data/rgbs.txt data/rgbs

Or, equivalently:

$ OMP_NUM_THREADS=8 somoclu -x 200 -y 200 data/rgbs.txt data/rgbs

Avoid the following on a single node:

$ OMP_NUM_THREADS=1 mpirun -np 8 somoclu -x 200 -y 200 data/rgbs.txt data/rgbs

The same caveats apply to the sparse CPU kernel.

Visualisation

The primary purpose of generating a map is visualisation. Apart from the Python interface, Somoclu does not come with its own functions for visualisation, since there are numerous generic tools capable of plotting high-quality figures. The R version integrates with the kohonen package and the MATLAB version with SOM-Toolbox.
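
As an example, the Python interface can render the U-matrix directly through matplotlib (a minimal sketch; the random data and all parameter values are illustrative):

    import numpy as np
    import somoclu

    data = np.random.rand(200, 3).astype(np.float32)   # already in [0, 1]
    som = somoclu.Somoclu(30, 20, maptype="planar", gridtype="rectangular")
    som.train(data)
    som.view_umatrix(bestmatches=True)                 # opens a matplotlib figure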

The U-matrix and codebook outputs of the command-line version are compatible with Databionic ESOM Tools for more advanced visualisation.

Input File Formats

One sparse and two dense data formats are supported. All of them are plain text files. The entries can be separated by any white-space character. One row represents one data instance across all formats. Comment lines starting with a hash mark are ignored.

The sparse format follows the libsvm guidelines, with feature indices starting at zero. For instance, the vector [ 1.2 0 0 3.4 ] is represented as the following line in the file: 0:1.2 3:3.4. The file is parsed twice: once to get the number of instances and features, and a second time to read the data in the individual threads.
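
For example, a small helper that writes a SciPy sparse matrix in this format might look as follows (a hypothetical helper, not part of Somoclu):

    from scipy.sparse import csr_matrix

    def write_sparse(matrix, path):
        # Write a matrix in Somoclu's zero-indexed sparse format,
        # one data instance per line, as index:value pairs.
        m = csr_matrix(matrix)
        with open(path, "w") as f:
            for i in range(m.shape[0]):
                start, end = m.indptr[i], m.indptr[i + 1]
                pairs = ("%d:%g" % (j, v) for j, v in
                         zip(m.indices[start:end], m.data[start:end]))
                f.write(" ".join(pairs) + "\n")

    write_sparse([[1.2, 0, 0, 3.4]], "sparse_input.txt")  # line: 0:1.2 3:3.4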

The basic dense format includes the coordinates of the data vectors, separated by white space. Just like the sparse format, this file is parsed twice to get the basic dimensions right.

The .lrn file of Databionic ESOM Tools is also accepted and it is parsed only once. The format is described as follows:

% n
% m
% s1 s2 .. sm
% var_name1 var_name2 .. var_namem
x11 x12 .. x1m
x21 x22 .. x2m
...
xn1 xn2 .. xnm

Here n is the number of rows in the file, that is, the number of data instances, and m is the number of columns. The next row defines the column mask: the value 1 for a column means the column should be used in training. Note that the first column in this format is always a unique key, so it should have the value 9 in the column mask. The row with the variable names is ignored by Somoclu. The elements of the matrix follow -- from here on, the file is identical to the basic dense format, with the addition of the first column as the unique key.
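
To make the layout concrete, here is a sketch that writes such a file from a NumPy matrix (a hypothetical helper; the unique key is generated as a running index):

    import numpy as np

    def write_lrn(data, path):
        # Write data (n instances x d features) in the ESOM .lrn layout
        # described above, with a running index as the unique key column.
        n, d = data.shape
        with open(path, "w") as f:
            f.write("%% %d\n" % n)                        # number of instances
            f.write("%% %d\n" % (d + 1))                  # columns, incl. the key
            f.write("% 9 " + " ".join(["1"] * d) + "\n")  # mask: 9 = key column
            f.write("% key " + " ".join("var%d" % i for i in range(d)) + "\n")
            for i, row in enumerate(data):
                f.write("%d " % i + " ".join("%g" % x for x in row) + "\n")

    write_lrn(np.random.rand(5, 3), "example.lrn")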

If the input file is sparse but a dense kernel is invoked, Somoclu will run, but the results will be incorrect. Invoking a sparse kernel on a dense input file is likely to lead to a segmentation fault.

Interfaces

Python, Julia, R, and MATLAB interfaces are available for the dense CPU and GPU kernels. MPI and the sparse kernel are not supported through the interfaces. For respective examples, see the folders in src.

The Python version is also available on PyPI. You can install it with

$ pip install somoclu

Alternatively, it is also available on conda-forge:

$ conda install somoclu

Some pre-built binaries in the wheel format or as Windows installers are provided on PyPI Downloads; they are tested with Anaconda distributions. If you encounter errors like ImportError: DLL load failed: The specified module could not be found when importing somoclu, you may need to run Dependency Walker as shown here on _somoclu_wrap.pyd to find the missing DLLs and place them in the right location. Usually the right version (32/64-bit) of vcomp90.dll, msvcp90.dll, and msvcr90.dll should be put in C:\Windows\System32 or C:\Windows\SysWOW64.

The wheel binaries for macOS are compiled with the system clang++, which means they are not parallelized by default. To use the parallel version on a Mac, either use the version on conda-forge or compile from source with your favourite OpenMP-friendly compiler. To get it working with the GPU kernel, you might have to follow the instructions at Somoclu - Python Interface.

The R version is available on CRAN. You can install it with

install.packages("Rsomoclu")

To get it working with the GPU kernel, download the source zip file and specify your CUDA directory the following way:

R CMD INSTALL src/Rsomoclu_version.tar.gz --configure-args=/path/to/cuda

The Julia version is available on GitHub. The standard Pkg.add("Somoclu") should work.

For the MATLAB toolbox, install SOM-Toolbox following the instructions at ilarinieminen/SOM-Toolbox and pass the location of your MATLAB installation to the configure script:

./configure --without-mpi --with-matlab=/usr/local/MATLAB/R2014a

For the GPU kernel, specify the location of your CUDA library for the configure script. More detailed instructions are in the MATLAB source folder.

Compilation & Installation

These are the instructions for compiling the core library and the command line interface. The only dependency is a C++ compiler chain -- GCC, ICC, clang, and VC were tested.

Multicore execution is supported through OpenMP -- the compiler must support this. Distributed systems are supported through MPI. The package was tested with OpenMPI. It should also work with other MPI flavours. CUDA support is optional.

Linux or macOS

If you have just cloned the git repository, first run

$ ./autogen.sh

Then follow the standard POSIX procedure:

$ ./configure [options]
$ make
$ make install

Options for configure

--prefix=PATH           Set directory prefix for installation

By default Somoclu is installed into /usr/local. If you prefer a different location, use this option to select an installation directory.

--without-mpi           Disregard any MPI installation found.
--with-mpi=MPIROOT      Use MPI root directory.
--with-mpi-compilers=DIR or --with-mpi-compilers=yes
                          use MPI compiler (mpicxx) found in directory DIR, or
                          in your PATH if =yes
--with-mpi-libs="LIBS"  MPI libraries [default "-lmpi"]
--with-mpi-incdir=DIR   MPI include directory [default MPIROOT/include]
--with-mpi-libdir=DIR   MPI library directory [default MPIROOT/lib]

The above flags identify the MPI library the user wishes to use. They are especially useful if MPI is installed in a non-standard location, or when multiple MPI libraries are available.

--with-cuda=/path/to/cuda           Set path for CUDA

Somoclu looks for CUDA in /usr/local/cuda. If your installation is not there, then specify the path with this parameter. If you do not want CUDA enabled, set the parameter to --without-cuda.

Windows

Use the somoclu.sln under src/Windows/somoclu as an example Visual Studio 2015 solution. Modify the CUDA version or VC compiler version according to your needs.

The default solution enables all of OpenMP, MPI, and CUDA. The default MPI installation path is C:\Program Files (x86)\Microsoft SDKs\MPI\; modify the settings if yours is in a different path. The default CUDA version in the configuration is 9.1. Disable MPI by removing the HAVE_MPI macro in the project properties (Properties -> Configuration Properties -> C/C++ -> Preprocessor). Disable CUDA by removing the CUDA macro in the solution properties and unchecking CUDA in Project -> Custom Build Rules. If you open the solution without CUDA installed, remove the following sections from somoclu.vcxproj:

  <ImportGroup Label="ExtensionSettings">
    <Import Project="$(VCTargetsPath)\BuildCustomizations\CUDA 9.1.props" />
  </ImportGroup>

and

  <ImportGroup Label="ExtensionTargets">
    <Import Project="$(VCTargetsPath)\BuildCustomizations\CUDA 9.1.targets" />
  </ImportGroup>

or change the version number to match the CUDA version you installed.

Command-line usage is identical to the Linux version (see the relevant section).

Acknowledgment

This work was supported by the European Commission Seventh Framework Programme under Grant Agreement Number FP7-601138 PERICLES and by the AWS in Education Machine Learning Grant award.

Citation

  1. Peter Wittek, Shi Chao Gao, Ik Soo Lim, Li Zhao (2017). Somoclu: An Efficient Parallel Library for Self-Organizing Maps. Journal of Statistical Software, 78(9), pp.1--21. DOI:10.18637/jss.v078.i09. arXiv:1305.1422.