Extremely simple and fast extreme multi-class and multi-label classifiers.

Marek Wydmuch

Last update: Nov 14, 2022

Related tags

Deep Learning python machine-learning classification hsm datasets multi-label-classification plt multi-class-classification extreme-classification probabilistic-label-trees xmlc label-tree-classifiers

Overview

napkinXC

napkinXC is an extremely simple and fast library for extreme multi-class and multi-label classification that focuses on implementing various methods for Probabilistic Label Trees. It allows training a classifier for very large datasets in just a few lines of code with minimal resources.

Right now, napkinXC implements the following features both in Python and C++:

Probabilistic Label Trees (PLTs) and Hierarchical softmax (HSM),
different types of inference methods (top-k, above a given threshold, etc.),
fast prediction with label weights, e.g., propensity scores,
efficient online F-measure optimization (OFO) procedure,
different tree building methods, including hierarchical k-means clustering method,
training of tree node
support for custom tree structures, and node weights,
helpers to download and load data from XML Repository,
helpers to measure performance (precision@k, recall@k, nDCG@k, propensity-scored precision@k, and more).
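
To make the precision@k measure from the list above concrete, here is a minimal pure-Python sketch. It is not napkinXC's implementation; the function name `precision_at_k_sketch` and the input layout (one list of true labels and one ranked list of predicted labels per example) are assumptions made for illustration.

```python
def precision_at_k_sketch(y_true, y_pred, k):
    """Mean fraction of the top-k predicted labels that are relevant.

    y_true: list of lists with the true labels of each example.
    y_pred: list of lists with predicted labels, best first.
    """
    total = 0.0
    for true_labels, ranked in zip(y_true, y_pred):
        relevant = set(true_labels)
        # Count how many of the first k predictions are true labels.
        hits = sum(1 for label in ranked[:k] if label in relevant)
        total += hits / k
    return total / len(y_true)


y_true = [[1, 5], [2]]
y_pred = [[5, 3, 1], [4, 2, 7]]
print(precision_at_k_sketch(y_true, y_pred, k=1))
```

For real evaluations, the library's own helpers in napkinxc.measures are the better choice.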

Please note that this library is still under development and also serves as a base for experiments. API may not be compatible between releases and some of the experimental features may not be documented. Do not hesitate to open an issue in case of a question or problem!

napkinXC is distributed under the MIT license. All contributions to the project are welcome!

Python Quick Start and Documentation

The Python (3.5+) version of napkinXC can be installed from the PyPI repository on Linux and macOS. It requires a modern C++17 compiler, CMake, and Git to be installed:

pip install napkinxc

or the latest master version directly from the GitHub repository (not recommended):

pip install git+https://github.com/mwydmuch/napkinXC.git

Minimal example of usage:

from napkinxc.datasets import load_dataset
from napkinxc.models import PLT
from napkinxc.measures import precision_at_k

X_train, Y_train = load_dataset("eurlex-4k", "train")
X_test, Y_test = load_dataset("eurlex-4k", "test")
plt = PLT("eurlex-model")
plt.fit(X_train, Y_train)
Y_pred = plt.predict(X_test, top_k=1)
print(precision_at_k(Y_test, Y_pred, k=1))

More examples can be found under the python/examples directory. napkinXC's documentation is available at https://napkinxc.readthedocs.io.

Executable

napkinXC can also be used as an executable to train and evaluate models using data in LIBSVM format. See the documentation for more details.
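
For readers unfamiliar with the multi-label flavour of the LIBSVM format, the sketch below parses a single line of the form `label,label index:value index:value`. It only illustrates the file layout under that assumption; `parse_libsvm_line` is a hypothetical helper, not part of napkinXC, and any header line a file may carry is not handled.

```python
def parse_libsvm_line(line):
    """Split one multi-label LIBSVM line into (labels, features).

    Assumed layout: "3,7 0:0.5 12:1.0", that is, comma-separated labels
    followed by space-separated index:value pairs. The label field may be
    missing for examples without labels.
    """
    tokens = line.strip().split()
    labels, features = [], {}
    # The first token holds labels only if it is not an index:value pair.
    if tokens and ":" not in tokens[0]:
        labels = [int(label) for label in tokens[0].split(",") if label]
        tokens = tokens[1:]
    for token in tokens:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return labels, features


print(parse_libsvm_line("3,7 0:0.5 12:1.0"))
```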

References and acknowledgments

This library implements methods from the following papers (see the experiments directory for scripts to replicate the results):

Another implementation of the PLT model is available in the extremeText library, which implements the approach described in this NeurIPS paper.

Comments

OOM/SegFault issues?
PLTs train extremely quickly using this implementation which is fantastic to see. However, I have run into a few issues when training on larger datasets:

There is no batching method by default, which requires very large matrices to be held in memory in order to train the model. I assume the way to avoid this memory issue is FitOnFile.

Even if the training data fits comfortably in memory, at larger sizes such as >1 million training data points with >10k labels, the Python kernel crashes, which I assume is due to an OOM or segfault error on the C++ side. It feels like there must be a memory leak somewhere, as the actual trees themselves never get that large, and I assume that internally the model trains in batches as outlined in the paper.

bug
opened by ASharmaML 3
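
When debugging a crash like the one reported above, a rough estimate of how much memory the input matrix itself occupies can help separate "the data is simply too large" from other causes. The sketch below estimates the size of a matrix stored in CSR layout. The byte widths (4-byte values and column indices, 8-byte row pointers) and the example sizes are assumptions, and the estimate says nothing about what the library allocates internally.

```python
def csr_bytes(n_rows, nnz, value_bytes=4, index_bytes=4, indptr_bytes=8):
    """Approximate memory of a CSR matrix: values + column indices + row pointers."""
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * indptr_bytes


# Hypothetical dataset: 1 million rows with 100 non-zero features each.
n_rows = 1_000_000
nnz = n_rows * 100
print(f"{csr_bytes(n_rows, nnz) / 2**30:.2f} GiB")
```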

feature dimension mismatch between train and test data

Hi,

There seems to be a bug in the data loading process.

For example:

from napkinxc.datasets import load_dataset
trn_X, _ = load_dataset('wiki10-31k', 'train')
tst_X, _ = load_dataset('wiki10-31k', 'test')
print('# of features of training data', trn_X.shape[1])
print('# of features of test data', tst_X.shape[1])

gives:

# of features of training data 101938
# of features of test data 101937

Cheers, Han

opened by xiaohan2012 3
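
One plausible cause of the mismatch reported above, which the issue itself does not confirm, is that the number of features is inferred separately for each split from the largest feature index it contains. The sketch below reproduces that effect on toy data and shows the usual remedy of using one shared width for both splits. `infer_n_features` and the dict-per-row layout are hypothetical.

```python
def infer_n_features(rows):
    """Infer the matrix width from the largest 0-based feature index seen.

    rows: iterable of {feature_index: value} dicts, one per example.
    """
    return max((max(row) for row in rows if row), default=-1) + 1


train_rows = [{0: 1.0, 4: 0.5}, {2: 1.0}]
test_rows = [{1: 1.0, 3: 0.2}]  # the highest feature never occurs here

n_train = infer_n_features(train_rows)
n_test = infer_n_features(test_rows)
# Using the larger width for both splits keeps the shapes compatible.
n_features = max(n_train, n_test)
print(n_train, n_test, n_features)
```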

C++ compilation error when building

I'm trying to install the latest version using pip install git+https://github.com/mwydmuch/napkinXC.git, which gives the following compilation error:

  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp: In function ‘void solve_l2r_lr_dual(const problem*, float*, float, float, float, int)’:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:1335:29: error: no matching function for call to ‘max(float&, double)’
      Gmax = max(Gmax, fabs(gp));

Thanks for your time.

PS:

The full error message is:

  ERROR: Command errored out with exit status 1:
   command: /home/cloud-user/code/diverse-xml/.venv/bin/python3.8 /home/cloud-user/code/diverse-xml/.venv/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmphqbldjxt
       cwd: /tmp/pip-req-build-hszug3a8
  Complete output (149 lines):
  running bdist_wheel
  running build
  running build_py
  -- downloading/updating pybind11
  -- pybind11 directory found, pulling...
  From https://github.com/pybind/pybind11
   * branch            master     -> FETCH_HEAD
  --
  fatal: A branch named 'tag_v2.6.2' already exists.
  CMake Warning at GitUtils.cmake:251 (message):
    pybind11 some error happens.
  Call Stack (most recent call first):
    CMakeLists.txt:92 (git_clone)


  -- pybind11 v2.9.0 dev1
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /tmp/pip-req-build-hszug3a8/build
  [  3%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/blas/axpy.c.o
  [  7%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/args.cpp.o
  [ 10%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/base.cpp.o
  [ 14%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/blas/dot.c.o
  [ 17%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/blas/nrm2.c.o
  [ 21%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/blas/scal.c.o
  [ 25%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/linear.cpp.o
  [ 28%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/tron.cpp.o
  [ 32%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/log.cpp.o
  [ 35%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/main.cpp.o
  [ 39%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/measure.cpp.o
  [ 42%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/misc.cpp.o
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp: In function ‘void solve_l2r_lr_dual(const problem*, float*, float, float, float, int)’:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:1335:29: error: no matching function for call to ‘max(float&, double)’
      Gmax = max(Gmax, fabs(gp));
                               ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:16:36: note: candidate: template<class T> T max(T, T)
   template <class T> static inline T max(T x,T y) { return (x>y)?x:y; }
                                      ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:16:36: note:   template argument deduction/substitution failed:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:1335:29: note:   deduced conflicting types for parameter ‘T’ (‘float’ and ‘double’)
      Gmax = max(Gmax, fabs(gp));
                               ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp: In function ‘float calc_max_p(const problem*, const parameter*)’:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:2363:38: error: no matching function for call to ‘max(float&, double)’
     max_p = max(max_p, fabs(prob->y[i]));
                                        ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:16:36: note: candidate: template<class T> T max(T, T)
   template <class T> static inline T max(T x,T y) { return (x>y)?x:y; }
                                      ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:16:36: note:   template argument deduction/substitution failed:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:2363:38: note:   deduced conflicting types for parameter ‘T’ (‘float’ and ‘double’)
     max_p = max(max_p, fabs(prob->y[i]));
                                        ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp: In function ‘model* load_model(const char*)’:
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:3022:35: warning: format ‘%lf’ expects argument of type ‘double*’, but argument 3 has type ‘float*’ [-Wformat=]
    if (fscanf(_stream, _format, _var) != 1)\
                                     ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:3098:4: note: in expansion of macro ‘FSCANF’
      FSCANF(fp,"%lf",&bias);
      ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:3022:35: warning: format ‘%lf’ expects argument of type ‘double*’, but argument 3 has type ‘float*’ [-Wformat=]
    if (fscanf(_stream, _format, _var) != 1)\
                                     ^
  /tmp/pip-req-build-hszug3a8/src/liblinear/linear.cpp:3136:4: note: in expansion of macro ‘FSCANF’
      FSCANF(fp, "%lf ", &model_->w[i*nr_w+j]);
      ^
  [ 46%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/model.cpp.o
  python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/build.make:159: recipe for target 'python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/linear.cpp.o' failed
  make[2]: *** [python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/linear.cpp.o] Error 1
  make[2]: *** Waiting for unfinished jobs....
  In file included from /tmp/pip-req-build-hszug3a8/src/base.h:34:0,
                   from /tmp/pip-req-build-hszug3a8/src/base.cpp:27:
  /tmp/pip-req-build-hszug3a8/src/vector.h:216:25: warning: inline function ‘virtual Real AbstractVector::at(int) const’ used but never defined
       virtual inline Real at(int index) const = 0;
                           ^
  /tmp/pip-req-build-hszug3a8/src/vector.h:217:26: warning: inline function ‘virtual Real& AbstractVector::operator[](int)’ used but never defined
       virtual inline Real& operator[](int index) = 0;
                            ^
  In file included from /tmp/pip-req-build-hszug3a8/src/misc.h:35:0,
                   from /tmp/pip-req-build-hszug3a8/src/misc.cpp:30:
  /tmp/pip-req-build-hszug3a8/src/matrix.h: In instantiation of ‘void RMatrix<T>::appendRow(const U&, bool) [with U = std::vector<IVPair<float> >; T = SparseVector]’:
  /tmp/pip-req-build-hszug3a8/src/misc.cpp:96:35:   required from here
  /tmp/pip-req-build-hszug3a8/src/matrix.h:38:44: error: invalid initialization of non-const reference of type ‘SparseVector&’ from an rvalue of type ‘void’
           T& row = r.emplace_back(vec, sorted);
                                              ^
  python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/build.make:229: recipe for target 'python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/misc.cpp.o' failed
  make[2]: *** [python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/misc.cpp.o] Error 1
  CMakeFiles/Makefile2:145: recipe for target 'python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/all' failed
  make[1]: *** [python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/all] Error 2
  Makefile:135: recipe for target 'all' failed

opened by xiaohan2012 2

segmentation fault (possibly in kmeans)

Hi,

The following snippet gives segmentation fault on my machine:

from napkinxc.models import PLT
from napkinxc.datasets import load_dataset

trn_X, trn_Y = load_dataset('eurlex-4k', "train", verbose=1)
model = PLT('output/test', tree_type='hierarchicalKmeans',
            arity=32,
            seed=25,
            threads=4, verbose=1)
model.fit(trn_X, trn_Y)

The output is:

napkinXC 0.5.1 - train
  Model: output/test
    Type: plt
  Base models optimizer: liblinear
    Solver: L2R_LR_DUAL, eps: 0.1, cost: 10, max iter: 100, weights threshold: 0.1
  Tree type: hierarchicalKmeans, arity: 32, k-means eps: 0.0001, balanced: 1, weighted features: 0, max leaves: 100
  Threads: 4, memory limit: ~29G
  Seed: 25
Building tree ...
Computing labels' features matrix in 4 threads ...
Hierarchical K-Means clustering in 4 threads ...
[2]    20767 segmentation fault (core dumped)  python mwe.py

Would it be possible to fix this?

Other info:

napkinXC 0.5.1
Python 3.8.3
Ubuntu 16.04.7 LTS

bug

opened by xiaohan2012 2

Support for custom tree (tree_structure in python interface)

Hi,

Thanks for writing this software, which is very useful!

I'm currently experimenting with the effect of label trees and wish to load trees from file.

Is it possible to pass a string to the tree_structure parameter in the models.PLT class, so that a custom tree can be loaded? It seems like the current Python interface does not support it.

If possible, I can make a pull request, and it would be nice if some instructions can be given, e.g., where and what to modify.

Cheers, Han
question

opened by xiaohan2012 2
a possible bug during kmeans initialization
Hi,

During kmeans initialization, the randomly generated row index can produce segfault.

https://github.com/mwydmuch/napkinXC/blob/6783031fc3ffe769092dc63292bb7f8857b73967/src/models/kmeans.cpp#L54

https://github.com/mwydmuch/napkinXC/blob/6783031fc3ffe769092dc63292bb7f8857b73967/src/models/kmeans.cpp#L56

I suppose it should be:

std::uniform_int_distribution<int> dist(0, points - 1);

Regards, Han
bug
opened by xiaohan2012 1
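
The off-by-one described above comes from the fact that std::uniform_int_distribution samples from a closed interval, so both bounds are possible results. Python's random.randint behaves the same way, which makes it a convenient analogy. The sketch below is not the library's code; the function names are made up for illustration.

```python
import random


def pick_row_out_of_range(points, rng):
    # Both bounds are inclusive, so this can return `points`,
    # which is one past the last valid row index.
    return rng.randint(0, points)


def pick_row_fixed(points, rng):
    # Valid row indices are 0 .. points - 1.
    return rng.randint(0, points - 1)


rng = random.Random(0)
draws = [pick_row_fixed(3, rng) for _ in range(1000)]
print(min(draws), max(draws))
```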
string "amazontitles-3M" to "amazontitles-3m" in datasets.py
Hi,

the following code is giving an error:

from napkinxc.datasets import load_dataset
_ = load_dataset('AmazonTitles-3M', 'train')

ValueError: Dataset AmazonTitles-3M is not available

It should be easy to fix: just change the string amazontitles-3M to amazontitles-3m in the file python/napkinxc/datasets.py
bug
opened by xiaohan2012 1
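
The error above is consistent with a lookup that lowercases the requested name before searching a table whose key still contains an uppercase letter. That behaviour is inferred from the proposed fix, not confirmed by the source, and the registry below is a made-up stand-in for the real table in datasets.py.

```python
def find_dataset(name, registry):
    """Look up a dataset by name, ignoring the case of the requested name."""
    key = name.lower()
    if key not in registry:
        raise ValueError(f"Dataset {name} is not available")
    return registry[key]


# Placeholder values; the real entries are not reproduced here.
registry_with_typo = {"eurlex-4k": "placeholder", "amazontitles-3M": "placeholder"}
registry_fixed = {"eurlex-4k": "placeholder", "amazontitles-3m": "placeholder"}

print(find_dataset("AmazonTitles-3M", registry_fixed))
```

With the uppercase key, the lowercased request can never match, no matter how the caller spells the name.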
Seg Fault Fix

Hello! While trying to do OFO using the python interface I was running into some segmentation faults happening at random times while doing Macro OFO. I went into the C++ code and I think I found the culprit, which was a pre-increment counter being used instead of the counter itself, which would result in the code trying to access memory outside the array's limits. When I tried running it again after making this change, I stopped getting seg faults and the OFO score actually improved. Disclaimer: I am quite rusty with C++ so please double check.

opened by atriantafybbc 1
predict_proba was not returning probabilities

The model.predict_proba was not returning probabilities as it was using the internal _model.predict() function instead of the _model.predict_proba() function.

opened by atriantafybbc 1
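
The delegation mistake described above is easy to picture with a small stand-in. The classes below are hypothetical and do not reproduce napkinXC's wrapper; the method signatures and return shapes are assumptions chosen only to show each public method forwarding to its matching backend method.

```python
class StubBackend:
    """Stand-in for the compiled model object."""

    def predict(self, X, top_k):
        return [[3] for _ in X]

    def predict_proba(self, X, top_k):
        return [[(3, 0.9)] for _ in X]


class ModelWrapper:
    def __init__(self, backend):
        self._model = backend

    def predict(self, X, top_k=1):
        return self._model.predict(X, top_k)

    def predict_proba(self, X, top_k=1):
        # The reported bug was the equivalent of calling
        # self._model.predict here, which drops the probabilities.
        return self._model.predict_proba(X, top_k)


model = ModelWrapper(StubBackend())
print(model.predict_proba([0, 1]))
```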
pip install napkinxc failed
I've been trying to install napkinxc via pip and have repeatedly run into the below error. Any idea what could be wrong? Thanks.

(base) atl436user1:~ user.name$ pip install napkinxc
Collecting napkinxc
  Using cached napkinxc-0.4.0.tar.gz (142 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing wheel metadata ... done
Requirement already satisfied: sklearn in ./anaconda3/lib/python3.7/site-packages/sklearn-0.0-py3.7.egg (from napkinxc) (0.0)
Requirement already satisfied: scipy in ./anaconda3/lib/python3.7/site-packages (from napkinxc) (1.3.1)
Requirement already satisfied: numpy in ./anaconda3/lib/python3.7/site-packages (from napkinxc) (1.18.1)
Requirement already satisfied: scikit-learn in ./anaconda3/lib/python3.7/site-packages (from sklearn->napkinxc) (0.22.1)
Requirement already satisfied: joblib>=0.11 in ./anaconda3/lib/python3.7/site-packages (from scikit-learn->sklearn->napkinxc) (0.14.1)
Building wheels for collected packages: napkinxc
  Building wheel for napkinxc (PEP 517) ... error
  ERROR: Command errored out with exit status 1:
   command: /Users/alec.delany/anaconda3/bin/python /Users/user.name/anaconda3/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/tmpogk81v8q
       cwd: /private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-install-4bhkxsrq/napkinxc
  Complete output (106 lines):
  running bdist_wheel
  running build
  running build_py
  -- downloading/updating pybind11
  -- pybind11 directory found, pulling...
  From https://github.com/pybind/pybind11
   * branch master -> FETCH_HEAD
  -- Already on 'master'
  -- Your branch is up to date with 'origin/master'.

  -- pybind11 v2.6.0
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-install-4bhkxsrq/napkinxc/build
  Scanning dependencies of target pynxc
  [ 6%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/base.cpp.o
  [ 9%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/args.cpp.o
  [ 15%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/blas/daxpy.c.o
  [ 15%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/blas/dnrm2.c.o
  [ 15%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/blas/ddot.c.o
  [ 21%] Building C object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/blas/dscal.c.o
  [ 21%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/data_reader.cpp.o
  [ 31%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/data_readers/vw_reader.cpp.o
  [ 31%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/data_readers/libsvm_reader.cpp.o
  [ 31%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/linear.cpp.o
  [ 34%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/liblinear/tron.cpp.o
  [ 37%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/log.cpp.o
  [ 40%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/main.cpp.o
  [ 43%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/measure.cpp.o
  [ 46%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/misc.cpp.o
  [ 50%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/model.cpp.o
  [ 53%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/models/br.cpp.o
  [ 56%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/models/extreme_text.cpp.o
  [ 59%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/models/hsm.cpp.o
  In file included from /private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-install-4bhkxsrq/napkinxc/src/model.cpp:36:
  /private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-install-4bhkxsrq/napkinxc/src/models/online_plt.h:46:10: error: 'shared_timed_mutex' is unavailable: introduced in macOS 10.12
      std::shared_timed_mutex treeMtx;
      ^
  /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/shared_mutex:205:58: note: 'shared_timed_mutex' has been explicitly marked unavailable here
      class _LIBCPP_TYPE_VIS _LIBCPP_AVAILABILITY_SHARED_MUTEX shared_timed_mutex
      ^
  [ 62%] Building CXX object python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/models/kmeans.cpp.o
  1 error generated.
  make[2]: *** [python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/__/__/__/src/model.cpp.o] Error 1
  make[2]: *** Waiting for unfinished jobs....
  make[1]: *** [python/napkinxc/_napkinxc/CMakeFiles/pynxc.dir/all] Error 2
  make: *** [all] Error 2

[cmake] configuring CMake project...

running build_py (cmake)

[cmake] building CMake project -> build

  Traceback (most recent call last):
    File "/Users/user.name/anaconda3/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
      main()
    File "/Users/user.name/anaconda3/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/Users/user.name/anaconda3/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 205, in build_wheel
      metadata_directory)
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 217, in build_wheel
      wheel_directory, config_settings)
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 202, in _build_with_temp_dir
      self.run_setup()
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 254, in run_setup
      self).run_setup(setup_script=setup_script)
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 145, in run_setup
      exec(compile(code, __file__, 'exec'), locals())
    File "setup.py", line 64, in <module>
      include_package_data=True,
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/cmaketools/__init__.py", line 98, in setup
      _setup(**setup_args)
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
      return distutils.core.setup(**attrs)
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 290, in run
      self.run_command('build')
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/Users/user.name/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/cmaketools/cmakecommands.py", line 110, in run
      self._run_cmake()
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/cmaketools/cmakecommands.py", line 104, in _run_cmake
      pkg_version=self.distribution.get_version(),
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/cmaketools/cmakebuilder.py", line 349, in run
      env=env,
    File "/private/var/folders/p9/mx3qb0ms6_gf8jx7tq82xr14s6lmfx/T/pip-build-env-evty4816/overlay/lib/python3.7/site-packages/cmaketools/cmakeutil.py", line 169, in build
      return sp.run(args, env=env).check_returncode()
    File "/Users/user.name/anaconda3/lib/python3.7/subprocess.py", line 422, in check_returncode
      self.stderr)
  subprocess.CalledProcessError: Command '['cmake', '--build', 'build', '-j', '7', '--config', 'Release']' returned non-zero exit status 2.

  ERROR: Failed building wheel for napkinxc
Failed to build napkinxc
ERROR: Could not build wheels for napkinxc which use PEP 517 and cannot be installed directly

Running python 3.7.1 on MacOS Catalina version 10.15.6.
opened by adelany3 1
build failed

ub16hp@UB16HP:~/ub16_prj/napkinXML$ make
[ 6%] Building CXX object CMakeFiles/nxml.dir/src/main.cpp.o
In file included from /home/ub16hp/ub16_prj/napkinXML/src/main.cpp:8:0:
/home/ub16hp/ub16_prj/napkinXML/src/base.h: In member function ‘double Base::predictLoss(U*)’:
/home/ub16hp/ub16_prj/napkinXML/src/base.h:65:25: error: ‘pow’ is not a member of ‘std’
     if(hingeLoss) val = std::pow(fmax(0, 1 - val), 2); // Hinge squared loss
     ^
/home/ub16hp/ub16_prj/napkinXML/src/base.h:65:49: error: there are no arguments to ‘fmax’ that depend on a template parameter, so a declaration of ‘fmax’ must be available [-fpermissive]
     if(hingeLoss) val = std::pow(fmax(0, 1 - val), 2); // Hinge squared loss
     ^
/home/ub16hp/ub16_prj/napkinXML/src/base.h:65:49: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/home/ub16hp/ub16_prj/napkinXML/src/base.h:66:32: error: there are no arguments to ‘exp’ that depend on a template parameter, so a declaration of ‘exp’ must be available [-fpermissive]
     else val = log(1 + exp(-val)); // Log loss
     ^
/home/ub16hp/ub16_prj/napkinXML/src/base.h: In member function ‘double Base::predictProbability(U*)’:
/home/ub16hp/ub16_prj/napkinXML/src/base.h:74:50: error: there are no arguments to ‘exp’ that depend on a template parameter, so a declaration of ‘exp’ must be available [-fpermissive]
     if(hingeLoss) val = 1.0 / (1.0 + exp(-2 * val)); // Probability for squared Hinge loss solver
     ^
/home/ub16hp/ub16_prj/napkinXML/src/base.h:75:37: error: there are no arguments to ‘exp’ that depend on a template parameter, so a declaration of ‘exp’ must be available [-fpermissive]
     else val = 1.0 / (1.0 + exp(-val)); // Probability
     ^
CMakeFiles/nxml.dir/build.make:86: recipe for target 'CMakeFiles/nxml.dir/src/main.cpp.o' failed
make[2]: *** [CMakeFiles/nxml.dir/src/main.cpp.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/nxml.dir/all' failed
make[1]: *** [CMakeFiles/nxml.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

opened by loveJasmine 1
pickling models
Hi,

I'm trying to use napkinXC under ray, which relies on pickle for data serialization.

It seems that napkinXC models cannot be pickled.

For instance,

import pickle
from napkinxc.models import PLT
model = PLT('/tmp/something/')
pickle.dump(model, open('/tmp/some-pickle.pkl', 'wb'))

gives:

TypeError: cannot pickle 'napkinxc._napkinxc.CPPModel' object.

Is there any workaround or any plan to support pickling for this issue?
opened by xiaohan2012 1
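
A common workaround for objects that hold an unpicklable native handle is to pickle only the lightweight state (here, the model directory) and rebuild the handle after unpickling. The sketch below shows the pattern with a hypothetical `PathBackedModel` class; a threading.Lock stands in for the C++ object, and nothing here is napkinXC's own API.

```python
import pickle
import threading


class PathBackedModel:
    """Wrapper that pickles its directory instead of its native handle."""

    def __init__(self, model_dir):
        self.model_dir = model_dir
        # Stand-in for an unpicklable C++ object.
        self._native = threading.Lock()

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_native"]  # drop what pickle cannot serialize
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # A real wrapper would reload the trained model from model_dir here.
        self._native = threading.Lock()


def roundtrip(model):
    """Serialize and restore a model, as a framework like ray would."""
    return pickle.loads(pickle.dumps(model))
```

Since the 0.4.1 release notes mention load and unload methods on the Python models, a wrapper of this kind could plausibly reload the trained model from disk inside `__setstate__`; whether that fits a given setup is left to the reader.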

Releases(0.6.2)

0.6.2(Oct 17, 2022)

0.6.1(Sep 3, 2022)

0.6.0(Mar 22, 2022)
Fix a lot of bugs present in the older Python version,

add the option to load weights in the sparse, map, or dense format,

add Windows compatibility.

0.5.2(May 17, 2021)
Add more datasets from XML repo,

fix some bugs in OVR and BR training.

0.5.1(Feb 6, 2021)
Fix ndcg_at_k method,

add new propensity scored measures: psrecall_at_k, psdcg_at_k, psndcg_at_k,

improve performance of load_libsvm method.

0.5.0(Feb 2, 2021)
Add prediction with weights,

add ofo method to Python bindings,

measures at k now return an array of all values from 1 to k,

fix many minor bugs.

0.4.2(Nov 4, 2020)
Fix some bugs.

0.4.1(Oct 26, 2020)
Add load and unload methods to Python models,

fix some bugs.

0.4.0(Sep 7, 2020)

First napkinXC release with Python bindings

Owner

Marek Wydmuch

Ph.D. student, machine learning and 3D graphics enthusiast

