AntonAntonov/MonadicLatentSemanticAnalysis

Software monad for Latent Semantic Analysis

Contributed by: Anton Antonov

Software monad for Latent Semantic Analysis (LSA). Three dimension reduction algorithms are supported: Singular Value Decomposition (SVD), Non-Negative Matrix Factorization (NNMF), and Independent Component Analysis (ICA).

Installation Instructions

To install this paclet in your Wolfram Language environment, evaluate this code:
PacletInstall["AntonAntonov/MonadicLatentSemanticAnalysis"]


To load the code after installation, evaluate this code:
Needs["AntonAntonov`MonadicLatentSemanticAnalysis`"]

Details

In this paclet the concept "software monad" is seen as a design pattern, suitable for rapid and dynamic building of computational workflows.
The monadic pipelines can be constructed with the double right arrow symbol, "⟹".
The monad has functions for document-term matrix construction.
The entries of the document-term matrices can be adjusted with typical Latent Semantic Indexing (LSI) weight functions (IDF, TFIDF, Cosine, etc.).
The monad has functions for extracting and tabulating topics and statistical thesauri.
The monad has functions for graph representations of the document-term matrices and document-topic matrices.
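
The steps listed above (document-term matrix construction, LSI weighting, topic extraction) can be sketched with built-in functions only. The following is a minimal illustration, not the paclet's implementation: the names makeDocTermMatrix, applyIDF, and extractTopics are hypothetical, the corpus is made up, whitespace tokenization and a plain IDF weight are assumed, and the steps are chained with Fold instead of the paclet's "⟹" operator.

```wolfram
(* Hypothetical toy corpus, for illustration only *)
docs = {"the cat sat on the mat", "the dog sat on the log", "cats and dogs play"};

(* Each step takes a context Association and returns an extended one *)
makeDocTermMatrix[ctx_Association] :=
  Module[{tokens, terms},
    (* Lower-case and split on whitespace; real pipelines also remove stop words *)
    tokens = StringSplit[ToLowerCase[#]] & /@ ctx["docs"];
    terms = Union @@ tokens;
    (* Rows are documents, columns are terms, entries are raw counts *)
    Join[ctx, <|"terms" -> terms,
      "matrix" -> Table[Count[doc, t], {doc, tokens}, {t, terms}]|>]];

applyIDF[ctx_Association] :=
  Module[{m = ctx["matrix"], idf},
    (* IDF per term: log(number of documents / number of documents containing the term) *)
    idf = N[Log[Length[m]/Total[Unitize[m]]]];
    (* Scale each column by its IDF weight *)
    Join[ctx, <|"weighted" -> m . DiagonalMatrix[idf]|>]];

extractTopics[k_Integer][ctx_Association] :=
  Module[{u, s, v},
    (* Truncated SVD keeping the k largest singular values *)
    {u, s, v} = SingularValueDecomposition[ctx["weighted"], k];
    (* For each topic, take the 3 terms with the largest absolute loadings *)
    Join[ctx, <|"topTerms" ->
      Table[ctx["terms"][[Ordering[Abs[v[[All, j]]], -3]]], {j, k}]|>]];

(* Chain the steps left to right, threading the context through *)
result = Fold[#2[#1] &, <|"docs" -> docs|>,
   {makeDocTermMatrix, applyIDF, extractTopics[2]}];

result["topTerms"]
```

Threading a single context through a sequence of small steps is the idea behind the monadic pipeline. In the paclet the same kind of chain is written with "⟹" and the paclet's own functions, which also cover NNMF and ICA in addition to SVD.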

Examples

Basic Examples (3) 

Get the presidential nomination acceptance speeches:

In[1]:=
speeches = ResourceData[
   ResourceObject["Presidential Nomination Acceptance Speeches"]];
texts = Normal[speeches[[All, "Text"]]];

Run the most frequently used LSA pipeline:

In[2]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/e4818f96-39e1-4a19-bcd0-9c1b344d02cd"]

Here is the summary box representation of the LSAMon object:

In[3]:=
lsaObj
Out[3]=

Publisher

Anton Antonov

Version History

  • 1.0.2 – 18 April 2024
  • 1.0.1 – 19 December 2023
  • 1.0.0 – 30 May 2023

License Information

Artistic License 2.0