LowRankModels.jl is a julia package for modeling and fitting generalized low rank models.

Madeleine Udell

Last update: Dec 17, 2022

Related tags

Deep Learning LowRankModels.jl

Overview

LowRankModels.jl

LowRankModels.jl is a Julia package for modeling and fitting generalized low rank models (GLRMs). GLRMs model a data array by a low rank matrix, and include many well known models in data analysis, such as principal components analysis (PCA), matrix completion, robust PCA, nonnegative matrix factorization, k-means, and many more.

For more information on GLRMs, see our paper. There is a python interface to this package, and a GLRM implementation in the H2O machine learning platform with interfaces in a variety of languages.

LowRankModels.jl makes it easy to mix and match loss functions and regularizers to construct a model suitable for a particular data set. In particular, it supports

using different loss functions for different columns of the data array, which is useful when data types are heterogeneous (e.g., real, boolean, and ordinal columns);
fitting the model to only some of the entries in the table, which is useful for data tables with many missing (unobserved) entries; and
adding offsets and scalings to the model without destroying sparsity, which is useful when the data is poorly scaled.

Installation

To install, just call

Pkg.add("LowRankModels")

at the Julia prompt.

Generalized Low Rank Models

GLRMs form a low rank model for tabular data A with m rows and n columns, which can be input as an array or any array-like object (for example, a data frame). It is fine if only some of the entries have been observed (i.e., the others are missing); the GLRM will only be fit on the !ismissing entries. The desired model is specified by choosing a rank k for the model, an array of loss functions losses, and two regularizers, rx and ry. The data is modeled as X'*Y, where X is a kxm matrix and Y is a kxn matrix. X and Y are found by solving the optimization problem

minimize sum_{(i,j) in obs} losses[j]((X'*Y)[i,j], A[i,j]) + sum_i rx(X[:,i]) + sum_j ry(Y[:,j])

The basic type used by LowRankModels.jl is the GLRM. To form a GLRM, the user specifies

the data A (any AbstractArray, such as an array, a sparse matrix, or a data frame)
the array of loss functions losses
the regularizers rx and ry
the rank k

The user may also specify

the observed entries obs
starting matrices X₀ and Y₀

obs is a list of tuples of the indices of the observed entries in the matrix, and may be omitted if all the entries in the matrix have been observed. If A is a sparse matrix, implicit zeros are interpreted as missing entries by default; see the discussion of sparse matrices below for more details. X₀ and Y₀ are initialization matrices that represent a starting guess for the optimization.

Losses and regularizers must be of type Loss and Regularizer, respectively, and may be chosen from a list of supported losses and regularizers, which include

Losses:

quadratic loss QuadLoss
hinge loss HingeLoss
logistic loss LogisticLoss
Poisson loss PoissonLoss
weighted hinge loss WeightedHingeLoss
l1 loss L1Loss
ordinal hinge loss OrdinalHingeLoss
periodic loss PeriodicLoss
multinomial categorical loss MultinomialLoss
multinomial ordinal (aka ordered logit) loss OrderedMultinomialLoss
bigger-vs-smaller loss BvSLoss (for ordinal data)
one-vs all-loss OvALoss (for categorical data)

The constructors for all the ordinal and categorical losses take as an argument the maximum (or both minimum and maximum) value the variable may take. Using the one-vs-all loss is equivalent to transforming a categorical value to a one-hot vector and using a binary loss on each entry in that vector. Using the bigger-vs-smaller loss is equivalent to transforming the ordinal value to a Boolean vector and using a binary loss on each entry in that vector. By default, the binary loss used is the logistic loss.

Regularizers:

quadratic regularization QuadReg
constrained squared euclidean norm QuadConstraint
l1 regularization OneReg
no regularization ZeroReg
nonnegative constraint NonNegConstraint (e.g., for nonnegative matrix factorization)
1-sparse constraint OneSparseConstraint (e.g., for orthogonal NNMF)
unit 1-sparse constraint UnitOneSparseConstraint (e.g., for k-means)
simplex constraint SimplexConstraint
l1 regularization, combined with nonnegative constraint NonNegOneReg
fix features at values y0 FixedLatentFeaturesConstraint(y0)

Each of these losses and regularizers can be scaled (for example, to increase the importance of the loss relative to the regularizer) by calling mul!(loss, newscale). Users may also implement their own losses and regularizers, or adjust internal parameters of the losses and regularizers; see losses.jl and regularizers.jl for more details.

Example

For example, the following code forms a k-means model with k=5 on the 100x100 matrix A:

using LowRankModels
m, n, k = 100, 100, 5
losses = QuadLoss() # minimize squared distance to cluster centroids
rx = UnitOneSparseConstraint() # each row is assigned to exactly one cluster
ry = ZeroReg() # no regularization on the cluster centroids
glrm = GLRM(A, losses, rx, ry, k)

To fit the model, call

X, Y, ch = fit!(glrm)

which runs an alternating directions proximal gradient method on glrm to find the X and Y minimizing the objective function. (ch gives the convergence history; see Technical details below for more information.)

The losses argument can also be an array of loss functions, with one for each column (in order). For example, for a data set with 3 columns, you could use

losses = Loss[QuadLoss(), LogisticLoss(), HingeLoss()]

Similiarly, the ry argument can be an array of regularizers, with one for each column (in order). For example, for a data set with 3 columns, you could use

ry = Regularizer[QuadReg(1), QuadReg(10), FixedLatentFeaturesConstraint([1.,2.,3.])]

This regularizes the first to columns of Y with ||Y[:,1]||^2 + 10||Y[:,2]||^2 and constrains the third (and last) column of Y to be equal to [1,2,3].

More examples here.

Missing data

If not all entries are present in your data table, just tell the GLRM which observations to fit the model to by listing tuples of their indices in obs, e.g., if obs=[(1,2), (5,3)], exactly two entries have been observed. Then initialize the model using

GLRM(A, losses, rx, ry, k, obs=obs)

If A is a DataFrame and you just want the model to ignore any entry that is missing, you can use

obs = observations(A)

Standard low rank models

Low rank models can easily be used to fit standard models such as PCA, k-means, and nonnegative matrix factorization. The following functions are available:

pca: principal components analysis
qpca: quadratically regularized principal components analysis
rpca: robust principal components analysis
nnmf: nonnegative matrix factorization
k-means: k-means

See the code for usage. Any keyword argument valid for a GLRM object, such as an initial value for X or Y or a list of observations, can also be used with these standard low rank models.

Scaling and offsets

If you choose, LowRankModels.jl can add an offset to your model and scale the loss functions and regularizers so all columns have the same pull in the model. Simply call

glrm = GLRM(A, losses, rx, ry, k, offset=true, scale=true)

This transformation generalizes standardization, a common preprocessing technique applied before PCA. (For more about offsets and scaling, see the code or the paper.)

You can also add offsets and scalings to previously unscaled models:

Add an offset to the model (by applying no regularization to the last row of the matrix Y, and enforcing that the last column of X be all 1s) using

add_offset!(glrm)

Scale the loss functions and regularizers by calling

equilibrate_variance!(glrm)

Scale only the columns associated to QuadLoss or HuberLoss loss functions.

prob_scale!(glrm)

Fitting DataFrames

Perhaps all this sounds like too much work. Perhaps you happen to have a DataFrame df lying around that you'd like a low rank (e.g., k=2) model for. For example,

import RDatasets
df = RDatasets.dataset("psych", "msq")

Never fear! Just call

glrm, labels = GLRM(df, k)
X, Y, ch = fit!(glrm)

This will fit a GLRM with rank k to your data, using a QuadLoss loss for real valued columns, HingeLoss loss for boolean columns, and ordinal HingeLoss loss for integer columns, a small amount of QuadLoss regularization, and scaling and adding an offset to the model as described here. It returns the column labels for the columns it fit, along with the model. Right now, all other data types are ignored. NaN values are treated as missing values (missings) and ignored in the fit.

The full call signature is

function GLRM(df::DataFrame, k::Int;
              losses = Loss[], rx = QuadReg(.01), ry = QuadReg(.01),
              offset = true, scale = false,
              prob_scale = true, NaNs_to_NAs = true)

You can modify the losses or regularizers, or turn off offsets or scaling, using these keyword arguments.

Or to specify a map from data types to losses, define a new loss_map from datatypes to losses (like probabilistic_losses, below):

probabilistic_losses = Dict{Symbol, Any}(
    :real        => QuadLoss,
    :bool        => LogisticLoss,
    :ord         => MultinomialOrdinalLoss,
    :cat         => MultinomialLoss
)

and input an array of datatypes (one for each column of your data frame: GLRM(A, k, datatypes; loss_map = loss_map). The full call signature is

function GLRM(df::DataFrame, k::Int, datatypes::Array{Symbol,1};
              loss_map = probabilistic_losses,
              rx = QuadReg(.01), ry = QuadReg(.01),
              offset = true, scale = false, prob_scale = true,
              transform_data_to_numbers = true, NaNs_to_NAs = true)

You can modify the losses or regularizers, or turn off offsets or scaling, using these keyword arguments.

To fit a data frame with categorical values, you can use the function expand_categoricals! to turn categorical columns into a Boolean column for each level of the categorical variable. For example, expand_categoricals!(df, [:gender]) will replace the gender column with a column corresponding to gender=male, a column corresponding to gender=female, and other columns corresponding to labels outside the gender binary, if they appear in the data set.

You can use the model to get some intuition for the data set. For example, try plotting the columns of Y with the labels; you might see that similar features are close to each other!

Fitting Sparse Matrices

If you have a very large, sparsely observed dataset, then you may want to encode your data as a sparse matrix. By default, LowRankModels interprets the sparse entries of a sparse matrix as missing entries (i.e. NA values). There is no need to pass the indices of observed entries (obs) -- this is done automatically when GLRM(A::SparseMatrixCSC,...) is called. In addition, calling fit!(glrm) when glrm.A is a sparse matrix will use the sparse variant of the proximal gradient descent algorithm, fit!(glrm, SparseProxGradParams(); kwargs...).

If, instead, you'd like to interpret the sparse entries as zeros, rather than missing or NA entries, use:

glrm = GLRM(...; sparse_na=false)

In this case, the dataset is dense in terms of observations, but sparse in terms of nonzero values. Thus, it may make more sense to fit the model with the vanilla proximal gradient descent algorithm, fit!(glrm, ProxGradParams(); kwargs...).

Parallel fitting (experimental)

LowRankModels makes use of Julia v0.5's new multithreading functionality to fit models in parallel. To fit a LowRankModel in parallel using multithreading, simply set the number of threads from the command line before starting Julia: e.g.,

export JULIA_NUM_THREADS=4

Technical details

Optimization

The function fit! uses an alternating directions proximal gradient method to minimize the objective. This method is not guaranteed to converge to the optimum, or even to a local minimum. If your code is not converging or is converging to a model you dislike, there are a number of parameters you can tweak.

Warm start

The algorithm starts with glrm.X and glrm.Y as the initial estimates for X and Y. If these are not given explicitly, they will be initialized randomly. If you have a good guess for a model, try setting them explicitly. If you think that you're getting stuck in a local minimum, try reinitializing your GLRM (so as to construct a new initial random point) and see if the model you obtain improves.

The function fit! sets the fields glrm.X and glrm.Y after fitting the model. This is particularly useful if you want to use the model you generate as a warm start for further iterations. If you prefer to preserve the original glrm.X and glrm.Y (e.g., for cross validation), you should call the function fit, which does not mutate its arguments.

You can even start with an easy-to-optimize loss function, run fit!, change the loss function (glrm.losses = newlosses), and keep going from your warm start by calling fit! again to fit the new loss functions.

Initialization

If you don't have a good guess at a warm start for your model, you might try one of the initializations provided in LowRankModels.

init_svd! initializes the model as the truncated SVD of the matrix of observed entries, with unobserved entries filled in with zeros. This initialization is known to result in provably good solutions for a number of "PCA-like" problems. See our paper for details.
init_kmeanspp! initializes the model using a modification of the kmeans++ algorithm for data sets with missing entries; see our paper for details. This works well for fitting clustering models, and may help in achieving better fits for nonnegative matrix factorization problems as well.
init_nndsvd! initializes the model using a modification of the NNDSVD algorithm as implemented by the NMF package. This modification handles data sets with missing entries by replacing missing entries with zeros. Optionally, by setting the argument max_iters=n with n>0, it will iteratively replace missing entries by their values as imputed by the NNDSVD, and call NNDSVD again on the new matrix. (This procedure is similar to the soft impute method of Mazumder, Hastie and Tibshirani for matrix completion.)

Parameters

As mentioned earlier, LowRankModels uses alternating proximal gradient descent to derive estimates of X and Y. This can be done by two slightly different procedures: (A) compute the full reconstruction, X' * Y, to compute the gradient and objective function; (B) only compute the model estimate for entries of A that are observed. The first method is likely preferred when there are few missing entries for A because of hardware level optimizations (e.g. chunking the operations so they just fit in various caches). The second method is likely preferred when there are many missing entries of A.

To fit with the first (dense) method:

fit!(glrm, ProxGradParams(); kwargs...)

To fit with the second (sparse) method:

fit!(glrm, SparseProxGradParams(); kwargs...)

The first method is used by default if glrm.A is a standard matrix/array. The second method is used by default if glrm.A is a SparseMatrixCSC.

ProxGradParams() and SparseProxGradParams() run these respective methods with the default parameters:

stepsize: The step size controls the speed of convergence. Small step sizes will slow convergence, while large ones will cause divergence. stepsize should be of order 1.
abs_tol: The algorithm stops when the decrease in the objective per iteration is less than abs_tol*length(obs).
rel_tol: The algorithm stops when the decrease in the objective per iteration is less than rel_tol.
max_iter: The algorithm also stops if maximum number of rounds max_iter has been reached.
min_stepsize: The algorithm also stops if stepsize decreases below this limit.
inner_iter: specifies how many proximal gradient steps to take on X before moving on to Y (and vice versa).

The default parameters are: ProxGradParams(stepsize=1.0;max_iter=100,inner_iter=1,abs_tol=0.00001,rel_tol=0.0001,min_stepsize=0.01*stepsize)

Convergence

ch gives the convergence history so that the success of the optimization can be monitored; ch.objective stores the objective values, and ch.times captures the times these objective values were achieved. Try plotting this to see if you just need to increase max_iter to converge to a better model.

Imputation

After fitting a GLRM, you can use it to impute values of A in four different ways:

impute(glrm) gives the maximum likelihood estimates for each entry
impute_missing(glrm) imputes missing entries and leaves observed entries unchanged
sample(glrm) gives a draw from the posterior distribution, conditioned on the fit values of X and Y, for each entry
sample_missing(glrm) samples missing entries and leaves observed entries unchanged

Cross validation

A number of useful functions are available to help you check whether a given low rank model overfits to the test data set. These functions should help you choose adequate regularization for your model.

Cross validation

cross_validate(glrm::GLRM, nfolds=5, params=Params(); verbose=false, use_folds=None, error_fn=objective, init=None): performs n-fold cross validation and returns average loss among all folds. More specifically, splits observations in glrm into nfolds groups, and builds new GLRMs, each with one group of observations left out. Fits each GLRM to the training set (the observations revealed to each GLRM) and returns the average loss on the test sets (the observations left out of each GLRM).

Optional arguments:
- use_folds: build use_folds new GLRMs instead of n_folds new GLRMs, each with 1/nfolds of the entries left out. (use_folds defaults to nfolds.)
- error_fn: use a custom error function to evaluate the fit, rather than the objective. For example, one might use the imputation error by setting error_fn = error_metric.
- init: initialize the fit using a particular procedure. For example, consider init=init_svd!. See Initialization for more options.
cv_by_iter(glrm::GLRM, holdout_proportion=.1, params=Params(1,1,.01,.01), niters=30; verbose=true): computes the test error and train error of the GLRM as it is trained. Splits the observations into a training set (1-holdout_proportion of the original observations) and a test set (holdout_proportion of the original observations). Performs params.maxiter iterations of the fitting algorithm on the training set niters times, and returns the test and train error as a function of iteration.

Regularization paths

regularization_path(glrm::GLRM; params=Params(), reg_params=exp10.(range(2,stop=-2,length=5)), holdout_proportion=.1, verbose=true, ch::ConvergenceHistory=ConvergenceHistory("reg_path")): computes the train and test error for GLRMs varying the scaling of the regularization through any scaling factor in the array reg_params.

Utilities

get_train_and_test(obs, m, n, holdout_proportion=.1): splits observations obs into a train and test set. m and n must be at least as large as the maximal value of the first or second elements of the tuples in observations, respectively. Returns observed_features and observed_examples for both train and test sets.

ScikitLearn

This library implements the ScikitLearn.jl interface. These models are available: SkGLRM, PCA, QPCA, NNMF, KMeans, RPCA. See their docstrings for more information (e.g. ?QPCA). All models support the ScikitLearnBase.fit! and ScikitLearnBase.transform interface. Examples:

## Apply PCA to the iris dataset
using LowRankModels
import ScikitLearnBase
using RDatasets    # may require Pkg.add("RDatasets")

A = convert(Matrix, dataset("datasets", "iris")[[:SepalLength, :SepalWidth, :PetalLength, :PetalWidth]])
ScikitLearnBase.fit_transform!(PCA(k=3, max_iter=500), A)

## Fit K-Means to a fake dataset of two Gaussians
using LowRankModels
import ScikitLearnBase

# Generate two disjoint Gaussians with 100 and 50 points
gaussian1 = randn(100, 2) + 5
gaussian2 = randn(50, 2) - 10
# Merge them into a single dataset
A = vcat(gaussian1, gaussian2)

model = ScikitLearnBase.fit!(LowRankModels.KMeans(), A)
# Count how many points are assigned to each Gaussians (should be 100 and 50)
Set(sum(ScikitLearnBase.transform(model, A), 1))

These models can be used inside a ScikitLearn pipeline, and every hyperparameter can be tuned with GridSearchCV.

Citing this package

If you use LowRankModels for published work, we encourage you to cite the software.

Use the following BibTeX citation:

@article{glrm,
    title = {Generalized Low Rank Models},
    author ={Madeleine Udell and Horn, Corinne and Zadeh, Reza and Boyd, Stephen},
    doi = {10.1561/2200000055},
    year = {2016},
    archivePrefix = "arXiv",
    eprint = {1410.0342},
    primaryClass = "stat-ml",
    journal = {Foundations and Trends in Machine Learning},
    number = {1},
    volume = {9},
    issn = {1935-8237},
    url = {http://dx.doi.org/10.1561/2200000055},
}

Comments

Penalties/regularization of imputed values

In many experimental contexts (RNA-sequencing is a good example), ground truth data that are below a detection threshold are not observed due to technical error. Thus, we have a data matrix A with many NA entries. However, we suspect many of these NA entries to be small, though not necessarily zero.

For each observed entry A_ij we have a loss function: L[A_ij, dot(x_i,y_j)]

Maybe for each unobserved entry we could add some regularization: R_a[dot(x_i,y_j)]

Where R_a is a function defined by the user... Perhaps R_a[z] = z^2 or R_a[z] = abs(z)

Aside: For the RNA sequencing application, something like R_a[z] = sqrt(z), z > 0 would be interesting (though not convex). Actually even something non-monotonic would be interesting for reasons I won't go into. These are probably too weird/specialized to include, but I would be curious.

opened by ahwillia 14
ScikitLearn.jl wrap
get-params wasn't working out with the current type definition. I've decided to wrap GLRM instead, it's going to be more consistent with the rest of the ScikitLearn.jl algorithms, and allows me to expose more hyperparameters.

I branched from dataframe-ux this time.

PCA works fine, but I'm having issues with K-Means. I've translated this example, and instead of getting a nice Voronoi diagram, I get this:

Clearly not random, but not quite Voronoi either! To get this picture, I call transform(kmeans, X) where X is a two-column matrix containing all (x,y) coordinates. transform uses FixedLatentFeaturesConstraint. Is there any reason why this would be incompatible with K-Means? Its output doesn't look right, even with abs_tol=1.e-20 (it converges in ~20 iterations)

2430x10 Array{Float64,2}: 0.0 0.0 0.0 … 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.570026 0.714604 -2.94676 2.10088 2.1646 -1.42798 0.191742 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...

(10 columns because there are 10 digits). Those 10 rows correspond to neighboring pixels. They should all be classified in the same cluster. But even ignoring the anomaly on row 4, the other rows are not consistent at all.

It would be straight-forward to implement transform using the distance from each centroid, but if there's something wrong with transform, it's better to fix that instead. Any ideas?
opened by cstjean 9
initial implementation of nndsvd

Here is a quick implementation of nndsvd.

We should probably develop a few test cases to make sure this is working as expected before merging. But it seems to be behaving reasonably in my hands so far...

Note: Sorry for the duplicate pull request. I'll delete the other one.

opened by ahwillia 8
Working on randomly generated data returns 0 element Arrays for labels and glrm.ry

Hi again,

I was trying to debug my analysis by working with a randomly generated dataset:

A = rand(100,2)*rand(2,100) A = convert(DataFrame, A) glrm, labels = GLRM(A, 2)

However the corresponding labels and glrm.ry objects are empty:

julia> labels 0-element Array{Symbol,1}

julia> glrm.ry 0-element Array{Regularizer,1}

It would be great if you could help me understand why this is happening?

Also could you suggest a base dataset that could be used to understand the different aspects of the LowRankModels code like the loss and regularizer options? I tried using the "psych" Dataset referred to in the README and this randomly generated matrix and ran into issues with them both.

Thanks a lot, Nandana

opened by NandanaSengupta 6
WIP: Support for the ScikitLearn API

I followed your suggestion. The result is not the prettiest, but it works. If it looks reasonable, I'll add the remaining functions and support NMF/KMeans/etc.

I'd like to understand why you copy Regularizer and Loss objects. Are they modified anywhere?

opened by cstjean 5
Compute X given Y

Once a model has been fit to a matrix A, is there any way to fit it to another matrix holding Y constant? For example, if factor analysis is part of a pipeline that ends with an SVM classifier, the cross-validation code should learn the feature matrix Y on the training set, and compute the data matrix X on the test set, given Y.

opened by cstjean 5
Change loss type names

Changes all loss function types to have CamelCase names, preventing naming conflicts with StatsBase. See: https://github.com/madeleineudell/LowRankModels.jl/issues/41

I can also rename all the regularization types to CamelCase if desired...

opened by ahwillia 5

Regularizers for Soft K-means

I've implemented some new regularizers that let you move between a soft and hard clustering. Here is a quick demo/explanation:

http://nbviewer.ipython.org/github/ahwillia/notebooks/blob/master/code_pubs/2015_08_11_regularized_soft_kmeans.ipynb

Is this too specialized to include in the repository? The code is pretty short so I've just pasted it below. Let me know if you want me to open a PR.

## indicator of vectors in the simplex, plus a penalty on Shannon entropy
## (intuition: soft k-means with encouraged sparseness)
type entr_simplex<:Regularizer
    scale::Float64
end
function evaluate(r::entr_simplex,a::AbstractArray)
    evaluate(simplex(),a) == Inf && return Inf # simplex constraint
    b = length(a) # base for entropy calculation (normalizes: 0<=entropy<=1)
    return r.scale*entropy(a,b) # penalize entropy if constraint satisfied
end
function prox!(r::entr_simplex,u::AbstractArray,alpha::Number)
    prox!(simplex(),u,alpha) # first project onto unit simplex
    b = length(u)    # base for entropy calculation (normalizes: 0<=entropy<=1)
    g = -log(b,u)-1  # gradient of entropy

    # project entropy gradient onto simplex, ignoring Infs
    gs = g - mean(g[g.!=Inf]) 
    for i = 1:length(u)
        u[i] = (g[i]==Inf) ? 0.0 : u[i] - r.scale*alpha*gs[i]
    end

    # make sure we didn't step off the simplex before returning
    prox!(simplex(),u,alpha)
end
entr_simplex() = entr_simplex(1)

## indicator of vectors in the simplex, plus a penalty on
## the l1 distance to the center of the simplex
## (intuition: soft k-means with encouraged sparseness)
type dist_simplex<:Regularizer
    scale::Float64
    k::Int # rank of model (store up front)
    d::Float64 # distance from corner to center (calculate up front)
end
function evaluate(r::dist_simplex,a::AbstractArray)
    evaluate(simplex(),a) == Inf && return Inf # simplex constraint
    dist = sum(abs(a-ones(r.k)/r.k)) # distance from center
    return r.scale*(1 - (dist/r.d))  # penalize dist from corners
end
function prox(r::dist_simplex,u::AbstractArray,alpha::Number)
    prox!(simplex(),u,alpha) # first project onto unit simplex

    # Calculate gradient
    shrink_step = -r.scale*alpha/(r.k-1)
    imax = indmax(u)
    for i = 1:length(u)
        if i == imax
            u[i] += r.scale*alpha
        else
            u[i] += shrink_step
        end
    end

    # make sure we didn't step off the simplex before returning
    prox!(simplex(),u,alpha)
end
function dist_simplex(s::Float64,k::Int)
    # k is the rank of the data
    e = zeros(k); e[1] = 1.0  # corner of simplex
    c = ones(k)/k             # center of simplex
    return dist_simplex(s,k,sum(abs(c-e)))
end
dist_simplex(k::Int) = dist_simplex(1.0,k)

opened by ahwillia 5

Use Sparse Matrices

I have a few applications that involve very large, very sparse (incomplete), matrices.

At the moment it appears that the entire data matrix, A, is estimated/reconstructed from X'*Y on each gradient step (see here). However, if many entries of A are missing, then only a subset of the X'*Y entries need to be computed to evaluate the loss function.

Would it make sense to make a new file sparse_proxgrad.jl in src/algorithms/ that is called when glrm.A is a sparse matrix?

opened by ahwillia 5
@threads causes massive slowdown

I find, while running julia v1.2 and LowRankModels v1.0.1, that the @threads annotation here sometimes causes massive slowdowns (3x in execution time and CPU utilization very low). My problem size is m,n,k = 200,150,10 and nthreads() == 2. Removing the annotation makes everything much faster.

opened by baggepinnen 4
One regularizer per column for X?

I've noticed that while one can pass an array of regularizers for Y (i.e. one per column of Y), the same is not possible for X. Is there a reason for that? Is there any interest in generalizing the code in that direction? Maybe I could help!

opened by amanoel 4
Compatibility and dependency updates

I tried to make the minimal changes here to make this package work with recent versions of Julia and package dependencies. Currently this package drastically downgrades DataFrames and probably other packages on installation. All tests seem to pass for me.

opened by kshedden 1
Use a more recent version of DataFrames

DataFrames has now released their v1.0, but this repo still locks its dependency to some prior versions, which can lead to compat issues with other packages. It would be great to relax that dependency.

opened by eperim 1

support `missing` in matrix rather than having to give `obs`

Since julia 1.0 has missing built in it would be nice to just naturally support it. Rather than having to pass in obs specifically.

Example using pca.

julia> data = map(x->rand()>0.2 ? x : missing, rand(5, 5)*rand(5,5))
5×5 Matrix{Union{Missing, Float64}}:
 1.06069   1.08958   missing  1.67055   1.20225
 0.751405  1.1944   1.31671    missing  1.23701
 1.33475   1.62695   missing  2.05511   1.52327
 0.909519  1.18476  1.70125   1.93304   1.3857
  missing  1.07582  1.23387   1.57813   1.09896

julia> pca(data, 3)
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] GLRM(A::Matrix{Union{Missing, Float64}}, losses::Vector{Loss}, rx::Vector{Regularizer}, ry::Vector{Regularizer}, k::Int64; X::Matrix{Float64}, Y::Matrix{Float64}, obs::Nothing, observed_features::Vector{UnitRange{Int64}}, observed_examples::Vector{UnitRange{Int64}}, offset::Bool, scale::Bool, checknan::Bool, sparse_na::Bool)
   @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/glrm.jl:66
 [2] GLRM(A::Matrix{Union{Missing, Float64}}, losses::Vector{Loss}, rx::Vector{Regularizer}, ry::Vector{Regularizer}, k::Int64)
   @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/glrm.jl:38
 [3] #GLRM#172
   @ ~/JuliaEnvs/LowRankModels.jl/src/utilities/conveniencemethods.jl:48 [inlined]
 [4] GLRM
   @ ~/JuliaEnvs/LowRankModels.jl/src/utilities/conveniencemethods.jl:48 [inlined]
 [5] #pca#107
   @ ~/JuliaEnvs/LowRankModels.jl/src/simple_glrms.jl:8 [inlined]
 [6] pca(A::Matrix{Union{Missing, Float64}}, k::Int64)
   @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/simple_glrms.jl:6
 [7] top-level scope
   @ REPL[34]:1

It's not to go and list all the obs manually and the it works:

julia> obs = [Tuple(ind) for ind in CartesianIndices(data) if !(data[ind] isa Missing)];

julia> pca(data, 3; obs=obs)
GLRM(Union{Missing, Float64}[1.0606880432831138 1.089579677823029 … 1.6705515670551134 1.2022467329602542; 0.7514051967634282 1.1944014349828096 … missing 1.2370061047120438; … ; 0.9095188136002703 1.1847633026389974 … 1.9330409876767132 1.38570067320635; missing 1.0758247860288004 … 1.5781294994670754 1.0989562029066084], Loss[QuadLoss(1.0, RealDomain()), QuadLoss(1.0, RealDomain()), QuadLoss(1.0, RealDomain()),
 QuadLoss(1.0, RealDomain()), QuadLoss(1.0, RealDomain())], Regularizer[ZeroReg(), ZeroReg(), ZeroReg(),
 ZeroReg(), ZeroReg()], Regularizer[ZeroReg(), ZeroReg(), ZeroReg(), ZeroReg(), ZeroReg()], 3, [[1, 2, 4
, 5], [1, 2, 3, 5], [1, 2, 4, 5], [1, 2, 3, 4, 5], [2, 3, 4, 5]], [[1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 4,
 5], [1, 3, 4, 5], [1, 2, 3, 4, 5]], [-0.6386482530950347 -1.018777729621554 … -1.9382553356751036 -1.053052636035794; -0.14517788192107056 -1.2458123989543037 … -2.2837186272581693 -1.5608357286351906; 0.2540552369948007 -1.4235247052609152 … 0.2976611081413319 0.7210816049116382], [0.4108564480213463 -1.0415581335449255 … 0.36006578018169627 0.11669723598708971; 0.05544466402953537 0.7168696680990544 … 0.19156334692424293 -0.48792765211129047; -0.8552253694850784 -1.5810995605873126 … -0.8580967626857172 -0.4419037908246037])

but we should make it happen automatically.

@jiahao said he has a fix for this.

opened by oxinabox 0

fix NaNtoMissing

running the tests was erroring with:

ERROR: LoadError: LoadError: LoadError: type CategoricalValue has no field value
Stacktrace:
  [1] getproperty(x::CategoricalValue{String, UInt8}, f::Symbol)
    @ Base ./Base.jl:33
  [2] isnan(x::CategoricalValue{String, UInt8})
    @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/fit_dataframe.jl:275
  [3] (::LowRankModels.var"#147#148"{DataFrame, Int64})(::Tuple{Int64, CategoricalValue{String, UInt8}})
    @ LowRankModels ./none:0
  [4] iterate
    @ ./generator.jl:47 [inlined]
  [5] collect(itr::Base.Generator{Base.Iterators.Enumerate{CategoricalVector{String, UInt8, String, Cate
goricalValue{String, UInt8}, Union{}}}, LowRankModels.var"#147#148"{DataFrame, Int64}})
    @ Base ./array.jl:678
  [6] NaNs_to_Missing!(df::DataFrame)
    @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/fit_dataframe.jl:281
  [7] GLRM(df::DataFrame, k::Int64; losses::Vector{Loss}, rx::QuadReg, ry::QuadReg, offset::Bool, scale:
:Bool, prob_scale::Bool, NaNs_to_Missing::Bool)
    @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/fit_dataframe_w_type_imputation.jl:17
  [8] GLRM(df::DataFrame, k::Int64)
    @ LowRankModels ~/JuliaEnvs/LowRankModels.jl/src/fit_dataframe_w_type_imputation.jl:15
  [9] top-level scope
    @ ~/JuliaEnvs/LowRankModels.jl/examples/fit_rdataset.jl:9
 [10] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [11] top-level scope
    @ ~/JuliaEnvs/LowRankModels.jl/examples/runexamples.jl:17
 [12] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [13] top-level scope
    @ ~/JuliaEnvs/LowRankModels.jl/test/runtests.jl:34
 [14] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [15] top-level scope
    @ none:6
in expression starting at /Users/oxinabox/JuliaEnvs/LowRankModels.jl/examples/fit_rdataset.jl:9
in expression starting at /Users/oxinabox/JuliaEnvs/LowRankModels.jl/examples/runexamples.jl:17
in expression starting at /Users/oxinabox/JuliaEnvs/LowRankModels.jl/test/runtests.jl:34
ERROR: Package LowRankModels errored during testing

Looking at the code for NaNtoMissing it was a bit funky. I made it less funky.

Now tests pass without errors for me

opened by oxinabox 1

Applying model to new data
Hi there, This looks a great package. I'm particularly interested in the ability to fit LRMs to datasets with missing data (or in my case, outliers that need to be masked). I have a quick question that may be pretty basic, but an answer would help me to apply the code to my own data. Apologies if I've missed something in the documentation. I'm also fairly new to Julia.

If I fit a PCA model to a set of training data A (following your example):

loss = QuadLoss() r = ZeroReg() n_comp = 1 glrm = GLRM(A,loss,r,r,n_comp) X,Y,ch. = fit!(glrm)

how do I then apply the same model to a new set of data B? I would like to keep X fixed and obtain new values Y_b that give the best fit of X to B. That is, I would like to project the observations in B onto the PCA components found from A.

There are other PCA packages in Julia that will do this (e.g., the reconstruct function in MultivariateStats), but they don't seem to be able to handle missing data or sparse arrays.

Thanks in advance! Any help is appreciated!
opened by gdbeck 2

Releases(v1.1.1)

v1.1.1(Nov 12, 2020)
LowRankModels v1.1.1

Diff since v1.1.0

Closed issues:

Register new version (#110)

Source code(tar.gz)
Source code(zip)
v1.1.0(Oct 23, 2020)
LowRankModels v1.1.0

Diff since v1.0.2

Closed issues:

Register new version (#103)

Merged pull requests:

Install TagBot as a GitHub Action (#105) (@JuliaTagBot)

Add Array() initialize.jl line 16 (#108) (@kathy908000)

Modernize package (#109) (@madeleineudell)

Source code(tar.gz)
Source code(zip)
v1.0.2(Nov 9, 2019)
v1.0.2 (2019-11-09)

Diff since v1.0.1

Closed issues:

Register v1.0.2 (#102)

Register v1.0.1 (#101)

@threads causes massive slowdown (#98)

Register v1.0.0 (compatible with Julia 1.0+) (#92)

A non linear GLRM is good for visualization? (#89)

How to run distributely? (#88)

expand_categoricals function not working (#84)

Specifying MultinomialLoss() for multiple columns (#81)

Merged pull requests:

fix julia1 errors (#93) (matbesancon)

Source code(tar.gz)
Source code(zip)
v1.0.1(Jul 23, 2019)
v1.0.1 (2019-07-23)

Diff since v1.0.0

Closed issues:

What Julia version is this package compatible for? (#90)

Merged pull requests:

fix Symbol typo (#86) (ramcha24)

Source code(tar.gz)
Source code(zip)
v1.0.0(Jul 18, 2019)

This release updates functionality to work on Julia 1.0+

One minor syntax change: mul! has replaced scale!, in keeping with new conventions in Julia 1.0+.
Source code(tar.gz)
Source code(zip)
v0.2.4(Jun 4, 2019)

This release adds functionality related to missing value imputation using the new Missing type. This is also the last version of LowRankModels that will be compatible with Julia v0.6.
Source code(tar.gz)
Source code(zip)
v0.2.3(Mar 30, 2018)

This version fixes some numerical problems in fitting the Poisson Loss, and removes defunct requirements for version numbers of dependencies.
Source code(tar.gz)
Source code(zip)
v0.2.2(Nov 1, 2017)

This release updates LowRankModels to remove warnings on Julia v0.6.
Source code(tar.gz)
Source code(zip)
v0.2.0(Nov 1, 2017)

This release fixes deprecations for Julia v0.6.
Source code(tar.gz)
Source code(zip)

Owner

Madeleine Udell

GitHub

MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

20 Nov 10, 2022

Code for "LoRA: Low-Rank Adaptation of Large Language Models"

LoRA: Low-Rank Adaptation of Large Language Models This repo contains the implementation of LoRA in GPT-2 and steps to replicate the results in our re

394 Jan 8, 2023

Gradient-free global optimization algorithm for multidimensional functions based on the low rank tensor train format

ttopt Description Gradient-free global optimization algorithm for multidimensional functions based on the low rank tensor train (TT) format and maximu

5 May 23, 2022

Learning hidden low dimensional dyanmics using a Generalized Onsager Principle and neural networks

OnsagerNet Learning hidden low dimensional dyanmics using a Generalized Onsager Principle and neural networks This is the original pyTorch implemenati

3 Aug 24, 2022

Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab

PySDM PySDM is a package for simulating the dynamics of population of particles. It is intended to serve as a building block for simulation systems mo

Atmospheric Cloud Simulation Group @ Jagiellonian University

32 Oct 18, 2022

Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

77 Dec 8, 2022

Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

R2RNet Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network." Jiang Hai, Zhu Xuan, Ren Yang, Yutong Hao, Fengzhu

77 Dec 24, 2022

Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

44 Dec 1, 2022

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

VoiceLoop PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a n

873 Dec 15, 2022

torchbearer: A model fitting library for PyTorch

Note: We're moving to PyTorch Lightning! Read about the move here. From the end of February, torchbearer will no longer be actively maintained. We'll

631 Jan 4, 2023

torchbearer: A model fitting library for PyTorch

Note: We're moving to PyTorch Lightning! Read about the move here. From the end of February, torchbearer will no longer be actively maintained. We'll

632 Dec 13, 2022

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

20.2k Jan 8, 2023

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

20.2k Jan 5, 2023

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

19.3k Feb 12, 2021

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Master Docs License Apache MXNet (incubating) is a deep learning framework designed for both efficiency an

29 Nov 16, 2022

LowRankModels.jl is a julia package for modeling and fitting generalized low rank models.

Related tags

Overview

LowRankModels.jl

Installation

Generalized Low Rank Models

Example

Missing data

Standard low rank models

Scaling and offsets

Fitting DataFrames

Fitting Sparse Matrices

Parallel fitting (experimental)

Technical details

Optimization

Warm start

Initialization

Parameters

Convergence

Imputation

Cross validation

Cross validation

Regularization paths

Utilities

ScikitLearn

Citing this package

Comments

Releases(v1.1.1)

v1.1.1(Nov 12, 2020)

LowRankModels v1.1.1

v1.1.0(Oct 23, 2020)

LowRankModels v1.1.0

v1.0.2(Nov 9, 2019)

v1.0.2 (2019-11-09)

v1.0.1(Jul 23, 2019)

v1.0.1 (2019-07-23)

v1.0.0(Jul 18, 2019)

v0.2.4(Jun 4, 2019)

v0.2.3(Mar 30, 2018)

v0.2.2(Nov 1, 2017)

v0.2.0(Nov 1, 2017)

Owner

Madeleine Udell

MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

Code for "LoRA: Low-Rank Adaptation of Large Language Models"

Gradient-free global optimization algorithm for multidimensional functions based on the low rank tensor train format

Learning hidden low dimensional dyanmics using a Generalized Onsager Principle and neural networks

Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab

Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

Joint parameterization and fitting of stroke clusters

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

torchbearer: A model fitting library for PyTorch

torchbearer: A model fitting library for PyTorch

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

MacroTools provides a library of tools for working with Julia code and expressions.

Numba-accelerated Pythonic implementation of MPDATA with examples in Python, Julia and Matlab

Python and Julia in harmony.

A little software to generate and save Julia or Mandelbrot's Fractals.