TextEssence

TextEssence is a tool for comparative text analysis.

Word embeddings capture patterns in how words are used in collections of text. Concept embeddings do the same for knowledge base concepts.

TextEssence lets you compare sets of embeddings trained on different corpora, to see what each set captures about its corpus.

Train word and concept embeddings on any corpus, no annotations required.
Compare embedding nearest neighbors across corpora.
Visualize changes in the organization of embedding spaces from different corpora.
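
To make the nearest-neighbor comparison concrete, here is a minimal sketch in plain Python. It is not TextEssence's own code or API: the function names (`cosine`, `nearest_neighbors`, `neighbor_overlap`), the use of cosine similarity, the Jaccard overlap score, and the toy two-dimensional vectors are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(embeddings, query, k=2):
    """Return the k keys most similar to `query`, excluding the query itself."""
    target = embeddings[query]
    candidates = [key for key in embeddings if key != query]
    candidates.sort(key=lambda key: cosine(embeddings[key], target), reverse=True)
    return candidates[:k]


def neighbor_overlap(emb_a, emb_b, query, k=2):
    """Jaccard overlap of the query's k nearest neighbors in two embedding sets."""
    a = set(nearest_neighbors(emb_a, query, k))
    b = set(nearest_neighbors(emb_b, query, k))
    return len(a & b) / len(a | b)


# Hypothetical toy embeddings, standing in for vectors trained on two corpora.
corpus_a = {
    "virus": [1.0, 0.0],
    "infection": [0.9, 0.1],
    "software": [0.0, 1.0],
    "computer": [0.1, 0.9],
}
corpus_b = {
    "virus": [0.0, 1.0],
    "infection": [0.9, 0.1],
    "software": [0.1, 0.9],
    "computer": [0.2, 0.8],
}

print(nearest_neighbors(corpus_a, "virus"))
print(nearest_neighbors(corpus_b, "virus"))
print(neighbor_overlap(corpus_a, corpus_b, "virus"))
```

A low overlap score for a word suggests that it is used differently in the two corpora. In this toy data, "virus" sits next to "infection" in the first set and next to "software" in the second.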

Read the Paper

Denis Newman-Griffis, Venkatesh Sivaraman, Adam Perer, Eric Fosler-Lussier, Harry Hochheiser. TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Read it on ACL Anthology

Interface

The web interface for TextEssence has three components:

Browse

Inspect

Compare