Sparse vs. Dense Vector Discussion on Linear Classifiers via @mathieuen @ogrisel @atpassos_ml

Mathieu Blondel @mblondel_ml

When the weight vector of a linear classifier is in dense format, you can access weights in O(1).

2010-11-13 17:37:04
Mathieu Blondel @mblondel_ml

When it's in sparse format, there's the overhead of using a hash table or b-tree.

2010-11-13 17:37:29
Mathieu Blondel @mblondel_ml

So sparse-dense dot products are more efficient than sparse-sparse dot products.

2010-11-13 17:37:45
Mathieu Blondel @mblondel_ml

Thus even for L1-induced sparse solutions, I'd go for a dense weight vector. Even with hundreds of millions of weights, it can fit in memory.

2010-11-13 17:38:10
Mathieu Blondel @mblondel_ml

@mdreid The trick for frequent scalar-vector products is really smart. I think Leon Bottou got it from Pegasos' paper.

2010-11-13 17:43:44
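The scalar-vector trick isn't spelled out in the thread, but the usual reading (as used in Pegasos-style SGD) is to store the weight vector as `w = scale * values`, so that multiplying every weight by a constant (e.g. the L2 shrink step) costs O(1) instead of O(n). A minimal sketch under that assumption; the class name and method names are mine:

```python
class ScaledVector:
    """Dense weight vector stored as w = scale * values, so that
    scalar-vector products touch one float instead of all n entries."""

    def __init__(self, n):
        self.scale = 1.0
        self.values = [0.0] * n

    def times_scalar(self, c):
        # O(1) scalar multiplication: fold c into the scale factor
        self.scale *= c

    def add_sparse(self, pairs, step):
        # w += step * x for sparse x given as (index, value) pairs;
        # divide by scale so that scale * values stays equal to w
        for i, v in pairs:
            self.values[i] += step * v / self.scale

    def dot_sparse(self, pairs):
        # <w, x> = scale * <values, x>
        return self.scale * sum(self.values[i] * v for i, v in pairs)
```

With this representation, the regularization shrink in each SGD step no longer dominates the cost when the examples themselves are sparse.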
Mathieu Blondel @mblondel_ml

Bob Carpenter briefly mentions the problem of sparse weight vectors in this post http://bit.ly/amLB5j

2010-11-13 17:45:35
Alexandre Passos @atpassos_ml

@mathieuen Not really. You can do a sparse-sparse dot product faster using the merge algorithm from mergesort. It's faster than a dense dot product

2010-11-13 19:35:16
Alexandre Passos @atpassos_ml

@mathieuen Say your data is a list of (index,weight) . Get two pointers, one for each vector, if they're equal add to the product, else

2010-11-13 19:36:21
Alexandre Passos @atpassos_ml

@mathieuen increment pointer to a smaller index and repeat until done. At most you will increment pointers N times.

2010-11-13 19:37:04
Alexandre Passos @atpassos_ml

@mathieuen And after doing the dot product with another similar pass you can compute the SGD. You can even speed this up with bsearch

2010-11-13 19:44:26
Alexandre Passos @atpassos_ml

@mathieuen And after doing the dot product with another similar pass you can compute the SGD. You can even speed this up with bsearch

2010-11-13 19:44:26
Alexandre Passos @atpassos_ml

@mathieuen (binary search or use a tree to get the next index if one vector is quite dense so you can do this in even less time)

2010-11-13 19:45:14
Alexandre Passos @atpassos_ml

@mathieuen (binary search or use a tree to get the next index if one vector is quite dense so you can do this in even less time)

2010-11-13 19:45:14
Mathieu Blondel @mblondel_ml

@atpassos_ml So the idea is to use the merge algorithm to find the dimensions which are non-zero in both vectors?

2010-11-13 20:07:35
Mathieu Blondel @mblondel_ml

@atpassos_ml In the sparse-dense case, it can be a waste if many of the dim that are non-zero in the feat vect are zero in the weight vect

2010-11-13 20:10:35
Mathieu Blondel @mblondel_ml

@atpassos_ml In the sparse-dense dp, u iterate over the non-zero dim in ur feat vect and retrieve the corresponding dim in the weight vect

2010-11-13 20:15:25
Alexandre Passos @atpassos_ml

@mathieuen Of course, in sparse-dense you just have to look at the sparse vector's entries. I mean in the sparse-sparse case.

2010-11-13 21:28:51
Alexandre Passos @atpassos_ml

@mathieuen Of course, in sparse-dense you just have to look at the sparse vector's entries. I mean in the sparse-sparse case.

2010-11-13 21:28:51
Alexandre Passos @atpassos_ml

@mathieuen My point is that with a clever sparse-sparse product (and the usual dumb sparse-dense product you mentioned) it's way faster

2010-11-13 21:31:29
Alexandre Passos @atpassos_ml

@mathieuen My point is that with a clever sparse-sparse product (and the usual dumb sparse-dense product you mentioned) it's way faster

2010-11-13 21:31:29
Alexandre Passos @atpassos_ml

@mathieuen to use a sparse representation for both the feature vector and the training data.

2010-11-13 21:31:47
Alexandre Passos @atpassos_ml

@mathieuen to use a sparse representation for both the feature vector and the training data.

2010-11-13 21:31:47