tosca: Tools for Statistical Content Analysis
Introduction
Data Preprocessing
    Read the Corpus
    Remove Umlauts and XML/HTML Tags
    Identifying Duplicates
    Clean Corpus
    Generate Wordlist
Descriptive Analysis
    Generic Functions
    Visualisation of Corpus over Time
    Frequency Analysis
    Write CSV Files
Generating Subcorpora
    Filter Corpus by Dates
    Filter Corpus by Wordcount
    Filter Corpus by Words
Latent Dirichlet Allocation
    Transform Corpus
    Performing LDA
    Validation of LDA Results
    Clustering of Topics
    Visualisation of Topics over Time
    Visualisation of Topic Share over Time
    Visualisation of Words in Topic over Time
    Visualisation of Words in Articles allocated to Topics
    Heatmap of Topics over Time including Clustering
    Individual Cases Contemplation
Example pipeline
Conclusion