{"id":18347,"date":"2020-03-16T08:19:43","date_gmt":"2020-03-16T07:19:43","guid":{"rendered":"https:\/\/www.inovex.de\/blog\/?p=18347"},"modified":"2024-06-20T07:33:00","modified_gmt":"2024-06-20T05:33:00","slug":"socialvistum-visualize-label-clusters-social-mediatexts","status":"publish","type":"post","link":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/","title":{"rendered":"SocialVisTUM: Visualize and Label Clusters of Social Media Texts"},"content":{"rendered":"<p>Web sources such as social networks, internet forums, and user reviews, provide large amounts of unstructured text data. Due to the steady development of new platforms and the increasing number of internet users, the interest in methods that automatically extract discussed topics in text data has increased in recent years. Organizations and scholars from different fields can utilize such methods to identify patterns and generate new insights. Examples are opinion researchers investigating current opinions on political and societal issues, consumer researchers interested in consumer beliefs about the consumption and production of goods, and marketing managers curious about the public perception of their products and services.<\/p>\n<p>Existing unsupervised methods to detect topics such as clustering and topic modeling represent topics as clusters or word lists (e.g., Blei et al., 2003; Chen et al., 2014). This information is often insufficient to get a comprehensible topic overview because words can be used in different contexts. Moreover, resulting word lists are often incoherent and consist of loosely related words. Additional information, like representative sentences, initial topic labels, and topic correlations, is necessary to gain a deeper understanding. Therefore, I developed a new user-friendly visualization and labeling toolkit called SocialVisTUM as part of my master\u2019s thesis. 
SocialVisTUM enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed <a href=\"https:\/\/github.com\/MartinKirchhoff\/SocialVisTum\">online<\/a>. Moreover, a variety of features can be used to customize the visualization and change initial topic labels. To illustrate its usage with one possible use case, let\u2019s test it on a new data set that consists of user comments about organic food. We will use a total of 83,938 sentences from the comment sections of articles about organic food from news websites like The Washington Post and The New York Times.<!--more--><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_85 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\"><\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Extracting-Topics\" >Extracting Topics<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#The-SocialVisTUM-Toolkit\" >The SocialVisTUM Toolkit<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Topics-as-Nodes\" >Topics as Nodes<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Topic-Correlations-as-Edges\" >Topic Correlations as Edges<\/a><\/li><li class='ez-toc-page-1 
ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Hiding-Insignificant-Topics\" >Hiding Insignificant Topics<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Topic-Inspection\" >Topic Inspection<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Automatic-Topic-Labels\" >Automatic Topic Labels<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Changing-Labels\" >Changing Labels<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Hyperparameter-Estimation\" >Hyperparameter Estimation<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Conclusion\" >Conclusion<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#Sources\" >Sources<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"Extracting-Topics\"><\/span>Extracting Topics<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We use an existing neural network model (ABAE by He et al., 2017) to extract topics from unlabeled text data. 
Embeddings (vectors) are used to represent topics and words because the numerical representation allows neural networks to process them. Moreover, we can use different notions of similarity, like the dot product and the cosine similarity, to compare embeddings and identify possible relations. ABAE uses an attention mechanism (Bahdanau et al., 2014) that learns to focus on the most important words in a sentence. Every sentence\u00a0\\(s \\) is represented by a vector \\(\\textbf{z}_s\\) that is defined as the weighted average of all the word embeddings of that sentence. The weights are attention scores calculated based on the contribution of the respective words to the meaning of the sentence and their relevance to the topics. The number of topics \\(\\textit{K}\\) must be specified before training. Accordingly, the topic embeddings are initialized as the centroids of a k-means clustering of the word embeddings of the corpus vocabulary and then stacked as topic embedding matrix \\(\\textbf{T} \\). During training, ABAE calculates sentence reconstructions \\(\\textbf{r}_s\\) for every sentence. These are linear combinations of the topic embeddings from \\(\\textbf{T} \\) and defined as<\/p>\n<p><center>\\(\\mathbf{r}_s=\\mathbf{T}^{\\top} \\cdot \\mathbf{p}_t\\),<\/center>where \\(\\textbf{p}_t\\) is the weight vector over \\(\\textit{K}\\) topic embeddings. Each weight corresponds to the probability that the input sentence belongs to the associated topic (see examples in Table 1). \\(\\textbf{p}_t\\) is obtained by reducing the dimension of \\(\\textbf{z}_s\\) to the number of topics \\(\\textit{K}\\) and applying softmax such that<\/p>\n<p><center>\\(\\mathbf{p}_t=\\operatorname{softmax}\\left(\\mathbf{W} \\cdot \\mathbf{z}_s+\\mathbf{b}\\right)\\),<\/center>where \\(\\textbf{W}\\) (weight matrix) and \\(\\textbf{b}\\) (bias vector) are both trainable parameters. 
Topic embeddings are updated during training to minimize the reconstruction error \\(J(\\theta)\\) based on the contrastive max-margin objective function:<\/p>\n<p><center>\\(J(\\theta)=\\sum_{s \\in D} \\sum_{i=1}^{m} \\max \\left(0,1-\\mathbf{r}_{s} \\mathbf{z}_s+\\mathbf{r}_s \\mathbf{n}_i\\right)\\)<\/center>For every sentence in the training set \\(\\textit{D}\\), we use \\(\\textit{m}\\) negative sample sentences \\(\\textbf{n}_i\\). The loss function rewards high similarity between the reconstructed sentence \\(\\textbf{r}_s\\) and the original sentence embedding \\(\\textbf{z}_s\\) and low similarity to the negative samples \\(\\textbf{n}_i\\). To reconstruct the original data, relevant parameters like the topic embeddings are adjusted. If we can reconstruct our original data using a few topic embeddings, we have successfully created a condensed representation of our data, which focuses on the most important themes.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The-SocialVisTUM-Toolkit\"><\/span>The SocialVisTUM Toolkit<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Figure 1 shows a visualization for the organic food data set created by the SocialVisTUM toolkit. Topics are represented as nodes and automatically labeled. The node size increases with the number of topic occurrences (shown in brackets next to the label). The edges between topics are labeled with the corresponding topic correlation, and links become thicker as the positive correlation increases. A graph layout based on repelling forces between nodes helps to avoid overlaps, which is especially helpful when many nodes and links are displayed. A second force keeps the graph centered. 
Moreover, users can move nodes around to get a comprehensible overview.<\/p>\n<figure id=\"attachment_18348\" aria-describedby=\"caption-attachment-18348\" style=\"width: 625px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-18348 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/topic-graph.png\" alt=\"a socialvistum graph with weighted edges\" width=\"625\" height=\"681\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/topic-graph.png 625w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/topic-graph-275x300.png 275w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/topic-graph-400x436.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/topic-graph-360x392.png 360w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><figcaption id=\"caption-attachment-18348\" class=\"wp-caption-text\">Figure 1: SocialVisTUM applied to the organic food data set. The topics, their occurrences (in brackets), and respective correlations are shown in the visualization.<\/figcaption><\/figure>\n<h3><span class=\"ez-toc-section\" id=\"Topics-as-Nodes\"><\/span>Topics as Nodes<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>After training the ABAE model, we receive the probability matrix \\(\\textbf{P}_t\\), which contains the probability vectors of every sentence. Each vector entry corresponds to the probability that the input sentence belongs to the associated topic (an example is shown in Table 1). We assign every sentence to its most likely topic based on\u00a0\\(\\textbf{P}_t\\) and then iterate over all sentences in the data set to count the number of topic occurrences. 
The number of topic occurrences helps to estimate topic importance because frequently discussed themes are usually more important.<\/p>\n<table style=\"height: 300px;\" border=\"1\" width=\"540\" align=\"center\">\n<caption style=\"caption-side: bottom;\">Table 1: Probability matrix example with three sentences and three topics: <i>GMO (genetically modified organism)<\/i>, <i>Diseases<\/i>, and <i>Organic<\/i>.<\/caption>\n<tbody>\n<tr style=\"color: #ffffff;\" bgcolor=\"#003c7e\">\n<td style=\"width: 180px;\" align=\"center\">Sentence<\/td>\n<td style=\"width: 120px;\" align=\"center\">GMO<\/td>\n<td style=\"width: 120px;\" align=\"center\">Diseases<\/td>\n<td style=\"width: 120px;\" align=\"center\">Organic<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 180px;\" align=\"center\">&#8220;GMO food is bad&#8221;<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.7<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.2<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.1<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 180px;\" align=\"center\">&#8220;GMO food causes chronic diseases&#8221;<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.45<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.45<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.1<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 180px;\" align=\"center\">&#8220;Organic food is healthy&#8221;<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.1<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.1<\/td>\n<td style=\"width: 120px;\" align=\"center\">0.8<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><span class=\"ez-toc-section\" id=\"Topic-Correlations-as-Edges\"><\/span>Topic Correlations as Edges<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>To calculate topic correlations, we iterate over all sentences in \\(\\textbf{P}_t\\) and calculate the Pearson correlation between the probability values of each pair of topics. 
This yields a value in the range [-1, 1] for every pair of topics that specifies how the two topics are related. Table 1 shows a small example of \\(\\textbf{P}_t\\); Table 2 lists the associated correlations. <i>GMO<\/i> and <i>Diseases<\/i> co-occur and are positively correlated, indicating that people often discuss diseases in the context of GMO. Because the topic <i>Organic<\/i> is unlikely to occur when either <i>GMO<\/i> or <i>Diseases<\/i> is discussed, it is negatively correlated with both.<\/p>\n<table style=\"height: 300px;\" border=\"1\" width=\"540\" align=\"center\">\n<caption style=\"caption-side: bottom;\">Table 2: Three topics and their respective correlations based on the probability matrix \\(\\textbf{P}_t\\).<\/caption>\n<tbody>\n<tr style=\"color: #ffffff;\" bgcolor=\"#003c7e\">\n<td style=\"width: 135px;\" align=\"center\"><\/td>\n<td style=\"width: 135px;\" align=\"center\">GMO<\/td>\n<td style=\"width: 135px;\" align=\"center\">Diseases<\/td>\n<td style=\"width: 135px;\" align=\"center\">Organic<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 135px; color: #ffffff;\" align=\"center\" bgcolor=\"#003c7e\">GMO<\/td>\n<td style=\"width: 135px;\" align=\"center\">1<\/td>\n<td style=\"width: 135px;\" align=\"center\">0.37<\/td>\n<td style=\"width: 135px;\" align=\"center\">-0.91<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 135px; color: #ffffff;\" align=\"center\" bgcolor=\"#003c7e\">Diseases<\/td>\n<td style=\"width: 135px;\" align=\"center\">0.37<\/td>\n<td style=\"width: 135px;\" align=\"center\">1<\/td>\n<td style=\"width: 135px;\" align=\"center\">-0.72<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 135px; color: #ffffff;\" align=\"center\" bgcolor=\"#003c7e\">Organic<\/td>\n<td style=\"width: 135px;\" align=\"center\">-0.91<\/td>\n<td style=\"width: 135px;\" align=\"center\">-0.72<\/td>\n<td style=\"width: 135px;\" align=\"center\">1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><span class=\"ez-toc-section\" id=\"Hiding-Insignificant-Topics\"><\/span>Hiding 
Insignificant Topics<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>In our tool, an occurrence threshold slider defines the percentage of sentences that must be about a topic in order to display the associated node. Another slider can be used to set the correlation threshold to define the required positive and\/or negative correlation to display the associated connections. These sliders (see Figure 2) are especially helpful to maintain a clear visualization by limiting the number of shown topics and connections when many of them are available.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Topic-Inspection\"><\/span>Topic Inspection<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Users can double-click a node to receive additional information about one topic. Afterwards, only nodes that are connected to the clicked node and the associated links are displayed. Moreover, the most similar words (see Figure 2 on the left) and sentences (see Figure 2 on the right) to the topic are shown. Since topic and word embeddings in ABAE share the same dimensionality, we can calculate the cosine similarity between each topic and every word embedding in our vocabulary. Then, we can use the 10 most similar words to represent each topic, similar to the way traditional topic models like LDA (Blei et al., 2003) represent topics as word distributions. As representative sentences, we select the sentences with the highest probability for each topic based on \\(\\textbf{P}_t\\). Representative sentences are helpful because they put representative words into a context, which helps to understand the underlying theme. 
To see all nodes and links again, the user can either double-click the same node again or double-click any other node to focus on it instead.<\/p>\n<figure id=\"attachment_18353\" aria-describedby=\"caption-attachment-18353\" style=\"width: 990px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-18353 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete.png\" alt=\"a screenshot of the socialvistum GUI\" width=\"990\" height=\"716\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete.png 990w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete-300x217.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete-768x555.png 768w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete-400x289.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete-720x520.png 720w, https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/15C_manual-labels_pesticides_complete-360x260.png 360w\" sizes=\"auto, (max-width: 990px) 100vw, 990px\" \/><figcaption id=\"caption-attachment-18353\" class=\"wp-caption-text\">Figure 2: SocialVisTUM after double-clicking the topic <i>pesticides<\/i>. Its most representative words and sentences, and the correlated topics farming, food products, and organic production standards are shown.<\/figcaption><\/figure>\n<h3><span class=\"ez-toc-section\" id=\"Automatic-Topic-Labels\"><\/span>Automatic Topic Labels<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>We use a new approach to label topic nodes automatically because the most similar word rarely serves as a suitable topic label. 
Initial topic labels are useful because they give users an immediate impression of each topic and eliminate the need to manually inspect representative words and sentences to come up with suitable labels. The labeling approach is based on shared hypernyms, which we identify using representative words and the lexical database WordNet (Miller, 1995). Hypernyms are words whose meaning includes the meaning of other words (hyponyms). For example, the word <i>dairy product<\/i> is a hypernym of the hyponyms <i>yoghurt<\/i> and <i>butter<\/i> because they are both dairy products.<\/p>\n<p>First, we retrieve the hypernym hierarchy for every representative word (see Figure 3) and compare the representative word with every other representative word of the same topic. Next, at each comparison, we save the shared hypernym with the lowest distance to the compared words in the hypernym hierarchy. We only consider hypernyms if their distance to both words is smaller than half the distance of the word to the root hypernym to avoid unspecific labels like <i>entity<\/i> and <i>abstraction<\/i>. Then, we use the hypernym that occurs most often as topic label. If no hypernym can be identified, we use the most representative word instead. In the example shown in Figure 3, we identify <i>dairy product<\/i> as lowest shared hypernym of <i>yoghurt<\/i> and <i>butter<\/i> and <i>food<\/i> as lowest shared hypernym of <i>yoghurt<\/i> and <i>bread<\/i>. 
Although the root hypernym <i>entity<\/i> is a shared hypernym of the words <i>yoghurt<\/i> and <i>wholesale<\/i>, we do not save it because we only consider hypernyms with a distance smaller than 3.5 (half the distance to the root hypernym) for <i>yoghurt<\/i> and 4.5 for <i>wholesale<\/i>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-18354\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/03\/updated_hypernym_path.png\" alt=\"stacked hypernym graphs\" width=\"510\" height=\"495\" \/><\/p>\n<p>The quality of the shared hypernym chosen as topic label can be approximated by inspecting the number of its hypernym occurrences (shown in Table 3). Topic labels that occur frequently as shared hypernym are usually suitable (e.g., <i>animal<\/i> (102) and <i>compound<\/i> (91)) in contrast to topic labels that occur rarely (e.g., <i>group action<\/i> (9) or <i>smuckers<\/i> (0)). Thus, we can use the number of hypernym occurrences of each topic to estimate the topic coherence for hyperparameter optimization.<\/p>\n<table style=\"height: 300px;\" border=\"1\" width=\"660\" align=\"center\">\n<caption style=\"caption-side: bottom;\">Table 3: Some topic labels we generated during the experiments. The value next to the topic label denotes how often the label occurs as a shared hypernym. 
The number of hypernyms in the right column specifies in how many word comparisons any shared hypernym is identified.<\/caption>\n<tbody>\n<tr style=\"color: #ffffff;\" bgcolor=\"#003c7e\">\n<td style=\"width: 220px;\" align=\"center\">Topic Label<\/td>\n<td style=\"width: 220px;\" align=\"center\">Representative Words<\/td>\n<td style=\"width: 220px;\" align=\"center\">Number of Hypernyms<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 220px;\" align=\"center\">animal (102)<\/td>\n<td style=\"width: 220px;\" align=\"center\">insect, ant, habitat, rodent, herbivore<\/td>\n<td style=\"width: 220px;\" align=\"center\">218<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 220px;\" align=\"center\">compound (91)<\/td>\n<td style=\"width: 220px;\" align=\"center\">amino, enzyme, metabolism, pottasium, molecule<\/td>\n<td style=\"width: 220px;\" align=\"center\">158<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 220px;\" align=\"center\">chemical (74)<\/td>\n<td style=\"width: 220px;\" align=\"center\">fungicide, insecticide, weedkiller, preservative, bpa<\/td>\n<td style=\"width: 220px;\" align=\"center\">131<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 220px;\" align=\"center\">systematically (0)<\/td>\n<td style=\"width: 220px;\" align=\"center\">systematically, adequately, cleaned, properly, milked<\/td>\n<td style=\"width: 220px;\" align=\"center\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 220px;\" align=\"center\">smuckers (0)<\/td>\n<td style=\"width: 220px;\" align=\"center\">smuckers, afterall, plz, assoc, fearmonering, 100x<\/td>\n<td style=\"width: 220px;\" align=\"center\">0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><span class=\"ez-toc-section\" id=\"Changing-Labels\"><\/span>Changing Labels<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>To change the label of a topic, the user can click on the associated label of a node. Thereby, a prompt is opened and the user can insert a new topic label. 
The user can download a JSON file with the updated labels by clicking the Create file button in the sidebar (not shown in the screenshot for lack of space).<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Hyperparameter-Estimation\"><\/span>Hyperparameter Estimation<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>ABAE has multiple hyperparameters, such as the number of topics and the vocabulary size, which are crucial for its performance. Our objective is to obtain topics that are as meaningful (coherent) as possible. The average coherence score (ACS) is usually used to estimate topic quality. It compares the number of times two representative words co-occur with the number of times each word occurs overall.<\/p>\n<p>One problem with the coherence score is that coherent topics are not necessarily represented by words that frequently co-occur. As an example, consider a topic whose most representative words are a set of synonyms (e.g., food, meal, dinner). This topic is very coherent because it describes one specific theme, but it produces a low coherence score because synonyms rarely co-occur in the same sentence. Moreover, the coherence score usually increases with the vocabulary size (see Table 4) because a large vocabulary tends to result in very specific representative words. Although this sounds like the desired outcome, most of the resulting representative word lists consist of words with multiple spelling mistakes or rare abbreviations. Therefore, topics cannot be interpreted adequately, and no good labels can be identified because misspelled words are not defined in WordNet. The ACS also increases with the number of topics (see Table 4), although a larger number of topics leads to disproportionately many incoherent topics for our data set.<\/p>\n<p>To solve these issues, we define and use a new metric, the average number of shared hypernyms (ANH), to identify suitable model parameters. 
To calculate the ANH, we first derive all shared hypernyms and their respective occurrences for each topic as done during the automatic topic labeling. Then, we define the ANH as the sum of hypernym occurrences over all topics divided by the number of topics. In contrast to the ACS, the ANH works well for synonyms because synonyms usually share hypernyms and thus result in a high ANH. Moreover, the ANH does not necessarily increase with the vocabulary size. Using the ANH, we found that a medium-sized vocabulary (about 10,000 words) produces the most coherent topics, which is in line with the manual topic inspection by domain experts. Unlike the ACS, the ANH also does not necessarily increase with the number of topics. Table 4 shows an excerpt of the results for varying parameters.<\/p>\n<table style=\"height: 300px;\" border=\"1\" width=\"800\" align=\"center\">\n<caption style=\"caption-side: bottom;\">Table 4: Comparison of the topic coherence based on the average coherence score (ACS) and the average number of shared hypernyms (ANH) for the organic food data set.<\/caption>\n<tbody>\n<tr style=\"color: #ffffff;\" bgcolor=\"#003c7e\">\n<td style=\"width: 200px;\" align=\"center\">Number of Topics<\/td>\n<td style=\"width: 200px;\" align=\"center\">Vocabulary Size<\/td>\n<td style=\"width: 200px;\" align=\"center\">Average Coherence Score (ACS)<\/td>\n<td style=\"width: 200px;\" align=\"center\">Average Number of Shared Hypernyms (ANH)<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" rowspan=\"3\" align=\"center\">5<\/td>\n<td style=\"width: 200px;\" align=\"center\">1,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-1104<\/td>\n<td style=\"width: 200px;\" align=\"center\">28.6<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" align=\"center\"><b>10,000<\/b><\/td>\n<td style=\"width: 200px;\" align=\"center\">-765<\/td>\n<td style=\"width: 200px;\" align=\"center\"><b>68.0<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" 
align=\"center\">18,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-403<\/td>\n<td style=\"width: 200px;\" align=\"center\">5.2<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" rowspan=\"3\" align=\"center\">15<\/td>\n<td style=\"width: 200px;\" align=\"center\">1,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-366<\/td>\n<td style=\"width: 200px;\" align=\"center\">33.3<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" align=\"center\"><b>10,000<\/b><\/td>\n<td style=\"width: 200px;\" align=\"center\">-270<\/td>\n<td style=\"width: 200px;\" align=\"center\"><b>40.0<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" align=\"center\">18,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-197<\/td>\n<td style=\"width: 200px;\" align=\"center\">33.8<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" rowspan=\"3\" align=\"center\">50<\/td>\n<td style=\"width: 200px;\" align=\"center\">1,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-110<\/td>\n<td style=\"width: 200px;\" align=\"center\">30.4<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" align=\"center\"><b>10,000<\/b><\/td>\n<td style=\"width: 200px;\" align=\"center\">-70<\/td>\n<td style=\"width: 200px;\" align=\"center\"><b>51.8<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 200px;\" align=\"center\">18,000<\/td>\n<td style=\"width: 200px;\" align=\"center\">-54<\/td>\n<td style=\"width: 200px;\" align=\"center\">49.7<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We implemented the SocialVisTUM toolkit to give organizations and scholars a quick and comprehensible visual overview of discussed topics in text data. SocialVisTUM can be used for any unlabeled English text corpus and displays relevant topic information in a force-directed graph. 
To give a detailed topic overview, we extended an existing method to extract topics (ABAE) and generated additional topic information in the form of representative sentences, topic labels, topic occurrences, and topic correlations. Moreover, users can dynamically adjust relevant parameters like the required number of topic occurrences and topic correlation to customize the visualization created by SocialVisTUM. Users can also show or hide topic information by double-clicking on topic nodes and change initial topic labels. To detect coherent topics, we introduced a new metric, the average number of shared hypernyms (ANH), that can be used to identify suitable hyperparameters and measure the quality of topics and topic models.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Sources\"><\/span>Sources<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li><a href=\"http:\/\/www.jmlr.org\/papers\/volume3\/blei03a\/blei03a.pdf\">Blei et al., 2003:\u00a0Latent Dirichlet Allocation<\/a><\/li>\n<li><a href=\"https:\/\/www.aclweb.org\/anthology\/P14-1033.pdf\">Chen et al., 2014: Aspect Extraction with Automated Prior Knowledge Learning<\/a><\/li>\n<li><a href=\"https:\/\/www.aclweb.org\/anthology\/P17-1036.pdf\">He et al., 2017: An Unsupervised Neural Attention Model for Aspect Extraction<\/a><\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1409.0473\">Bahdanau et al., 2014: Neural Machine Translation by Jointly Learning to Align and Translate<\/a><\/li>\n<li><a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/219717.219748\">Miller, 1995: WordNet: A Lexical Database for English<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Web sources such as social networks, internet forums, and user reviews, provide large amounts of unstructured text data. Due to the steady development of new platforms and the increasing number of internet users, the interest in methods that automatically extract discussed topics in text data has increased in recent years. 
Organizations and scholars from different [&hellip;]<\/p>\n","protected":false},"author":146,"featured_media":18358,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"ep_exclude_from_search":false,"footnotes":""},"tags":[509],"service":[76],"coauthors":[{"id":146,"display_name":"Martin Kirchhoff","user_nicename":"mkirchhoff"}],"class_list":["post-18347","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","tag-ai-2","service-artificial-intelligence"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>SocialVisTUM: Visualize and Label Clusters of Social Media Texts<\/title>\n<meta name=\"description\" content=\"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. Read on for the details and an interactive demo!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"SocialVisTUM: Visualize and Label Clusters of Social Media Texts\" \/>\n<meta property=\"og:description\" content=\"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. 
Read on for the details and an interactive demo!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/\" \/>\n<meta property=\"og:site_name\" content=\"inovex GmbH\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inovexde\" \/>\n<meta property=\"article:published_time\" content=\"2020-03-16T07:19:43+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-06-20T05:33:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Martin Kirchhoff\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum-1024x576.png\" \/>\n<meta name=\"twitter:creator\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:site\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"Martin Kirchhoff\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"13\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label3\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data3\" content=\"Martin Kirchhoff\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/\"},\"author\":{\"name\":\"Martin 
Kirchhoff\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/9004c1625e4a8e0a32c44ea259ee1806\"},\"headline\":\"SocialVisTUM: Visualize and Label Clusters of Social Media Texts\",\"datePublished\":\"2020-03-16T07:19:43+00:00\",\"dateModified\":\"2024-06-20T05:33:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/\"},\"wordCount\":2596,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2020\\\/02\\\/socialvistum.png\",\"keywords\":[\"Ai\"],\"articleSection\":[\"Analytics\",\"English Content\",\"General\"],\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/\",\"name\":\"SocialVisTUM: Visualize and Label Clusters of Social Media 
Texts\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2020\\\/02\\\/socialvistum.png\",\"datePublished\":\"2020-03-16T07:19:43+00:00\",\"dateModified\":\"2024-06-20T05:33:00+00:00\",\"description\":\"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. Read on for the details and an interactive demo!\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2020\\\/02\\\/socialvistum.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2020\\\/02\\\/socialvistum.png\",\"width\":1920,\"height\":1080,\"caption\":\"A stylized socialvistum 
graph\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/socialvistum-visualize-label-clusters-social-mediatexts\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"SocialVisTUM: Visualize and Label Clusters of Social Media Texts\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"name\":\"inovex GmbH\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\",\"name\":\"inovex GmbH\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"width\":1921,\"height\":1081,\"caption\":\"inovex 
GmbH\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/inovexde\",\"https:\\\/\\\/x.com\\\/inovexgmbh\",\"https:\\\/\\\/www.instagram.com\\\/inovexlife\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/inovex\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UC7r66GT14hROB_RQsQBAQUQ\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/9004c1625e4a8e0a32c44ea259ee1806\",\"name\":\"Martin Kirchhoff\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=gbe22160b16ce56a6a3ef6f58d554ebd5\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=g\",\"caption\":\"Martin Kirchhoff\"},\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/author\\\/mkirchhoff\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"SocialVisTUM: Visualize and Label Clusters of Social Media Texts","description":"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. 
Read on for the details and an interactive demo!","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/","og_locale":"de_DE","og_type":"article","og_title":"SocialVisTUM: Visualize and Label Clusters of Social Media Texts","og_description":"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. Read on for the details and an interactive demo!","og_url":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/","og_site_name":"inovex GmbH","article_publisher":"https:\/\/www.facebook.com\/inovexde","article_published_time":"2020-03-16T07:19:43+00:00","article_modified_time":"2024-06-20T05:33:00+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png","type":"image\/png"}],"author":"Martin Kirchhoff","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum-1024x576.png","twitter_creator":"@inovexgmbh","twitter_site":"@inovexgmbh","twitter_misc":{"Verfasst von":"Martin Kirchhoff","Gesch\u00e4tzte Lesezeit":"13\u00a0Minuten","Written by":"Martin Kirchhoff"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#article","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/"},"author":{"name":"Martin Kirchhoff","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/9004c1625e4a8e0a32c44ea259ee1806"},"headline":"SocialVisTUM: Visualize 
and Label Clusters of Social Media Texts","datePublished":"2020-03-16T07:19:43+00:00","dateModified":"2024-06-20T05:33:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/"},"wordCount":2596,"commentCount":0,"publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png","keywords":["Ai"],"articleSection":["Analytics","English Content","General"],"inLanguage":"de","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/","url":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/","name":"SocialVisTUM: Visualize and Label Clusters of Social Media Texts","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#primaryimage"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png","datePublished":"2020-03-16T07:19:43+00:00","dateModified":"2024-06-20T05:33:00+00:00","description":"We developed SocialVisTUM, a new user-friendly visualization and labeling toolkit which enables users to get a quick and comprehensible visual overview of topics and topic relations for any English text corpus and can be accessed online. 
Read on for the details and an interactive demo!","breadcrumb":{"@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#primaryimage","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2020\/02\/socialvistum.png","width":1920,"height":1080,"caption":"A stylized socialvistum graph"},{"@type":"BreadcrumbList","@id":"https:\/\/www.inovex.de\/de\/blog\/socialvistum-visualize-label-clusters-social-mediatexts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.inovex.de\/de\/"},{"@type":"ListItem","position":2,"name":"SocialVisTUM: Visualize and Label Clusters of Social Media Texts"}]},{"@type":"WebSite","@id":"https:\/\/www.inovex.de\/de\/#website","url":"https:\/\/www.inovex.de\/de\/","name":"inovex GmbH","description":"","publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.inovex.de\/de\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/www.inovex.de\/de\/#organization","name":"inovex 
GmbH","url":"https:\/\/www.inovex.de\/de\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","width":1921,"height":1081,"caption":"inovex GmbH"},"image":{"@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/inovexde","https:\/\/x.com\/inovexgmbh","https:\/\/www.instagram.com\/inovexlife\/","https:\/\/www.linkedin.com\/company\/inovex","https:\/\/www.youtube.com\/channel\/UC7r66GT14hROB_RQsQBAQUQ"]},{"@type":"Person","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/9004c1625e4a8e0a32c44ea259ee1806","name":"Martin Kirchhoff","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/secure.gravatar.com\/avatar\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=gbe22160b16ce56a6a3ef6f58d554ebd5","url":"https:\/\/secure.gravatar.com\/avatar\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/45591b6e9b85bcf08c80d989971786fe24e0fd095dcdb69e6f13e5da02708119?s=96&d=retro&r=g","caption":"Martin 
Kirchhoff"},"url":"https:\/\/www.inovex.de\/de\/blog\/author\/mkirchhoff\/"}]}},"_links":{"self":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/18347","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/users\/146"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/comments?post=18347"}],"version-history":[{"count":3,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/18347\/revisions"}],"predecessor-version":[{"id":54886,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/18347\/revisions\/54886"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media\/18358"}],"wp:attachment":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media?parent=18347"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/tags?post=18347"},{"taxonomy":"service","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/service?post=18347"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/coauthors?post=18347"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}