# Welcome to itembed

itembed is yet another variation of the well-known word2vec method proposed by Mikolov et al. [1], applied to unordered sequences, commonly referred to as itemsets. The contribution of itembed is twofold:

  1. Modifying the base algorithm to handle unordered sequences, which has an impact on the definition of context windows;
  2. Using the two embedding sets introduced in word2vec for supervised learning.

A similar philosophy is described by Wu et al. in StarSpace [2] and by Barkan and Koenigstein in item2vec [3]. itembed uses Numba [4] to achieve high performance.
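
To make the first point concrete, the snippet below is a minimal NumPy sketch of skip-gram with negative sampling applied to itemsets: because an itemset is unordered, every other item in the same itemset acts as context, rather than a fixed-size window around a position. It also keeps two embedding matrices, mirroring the two embedding sets mentioned in the second point. This is an illustration of the idea only, not the itembed API; the itemsets, dimensions, and hyperparameters are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and a few itemsets (unordered, so there is no window offset).
num_items = 6
itemsets = [
    [0, 2, 3],
    [1, 2, 4, 5],
    [0, 3, 5],
]

# Two embedding sets, as in word2vec: one for "target" items, one for "context" items.
dim = 8
syn0 = rng.normal(scale=0.1, size=(num_items, dim))  # target embeddings
syn1 = rng.normal(scale=0.1, size=(num_items, dim))  # context embeddings


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


learning_rate = 0.025
num_negatives = 2

for epoch in range(100):
    for itemset in itemsets:
        for i, target in enumerate(itemset):
            # Unordered context: every other item of the itemset is a positive pair,
            # instead of only the neighbors within a fixed-size window.
            for j, context in enumerate(itemset):
                if i == j:
                    continue
                # Skip-gram with negative sampling: one positive pair plus a few
                # random negatives drawn uniformly from the vocabulary.
                items = [context] + list(rng.integers(0, num_items, num_negatives))
                labels = [1.0] + [0.0] * num_negatives
                for item, label in zip(items, labels):
                    score = sigmoid(syn0[target] @ syn1[item])
                    g = learning_rate * (label - score)
                    old_target = syn0[target].copy()
                    syn0[target] += g * syn1[item]
                    syn1[item] += g * old_target

# After training, syn0 (or a combination of syn0 and syn1) can serve as item
# features for a downstream supervised model.
print(syn0.round(2))
```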

## Citation

If you use this software in your work, please cite it, for instance using the following BibTeX entry:

@software{itembed,
  author = {Johan Berdat},
  title = {itembed},
  url = {https://github.com/sdsc-innovation/itembed},
  version = {0.5.1},
  date = {2024-02-28},
}

  [1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. 2013. arXiv:1301.3781

  [2] Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, and Jason Weston. StarSpace: embed all the things! 2017. arXiv:1709.03856

  [3] Oren Barkan and Noam Koenigstein. Item2vec: neural item embedding for collaborative filtering. 2017. arXiv:1603.04259

  [4] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. Numba: an LLVM-based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 1–6. 2015. URL: https://doi.org/10.1145/2833157.2833162