By: Martin Feldkircher (Vienna School of International Studies), Márton Kardos (Aarhus University, Denmark), and Petr Koráb (Text Mining Stories)
1. Introduction
Topic modelling has recently progressed in two directions. One stream of Python packages focuses on improved statistical methods: more robust, efficient, preprocessing-free models that produce fewer junk topics (e.g., FASTopic). The other relies on the power of generative language models to extract intuitively understandable topics and their descriptions (e.g., TopicGPT [7], LLooM [6]).
Thanks to research on statistical methods for modelling text representations from transformers, junk topics are the exception rather than the norm in newer models. Meanwhile, novel, LLM-based approaches are challenging our long-standing views about what a topic model is and what it can do. Human-readable topic names and descriptions are increasingly an expected output of a well-designed topic modelling pipeline.
As exciting as these developments are, topic modelling is far from a solved problem. Neural topic models can be rather unstable and, because of their black-box nature, sometimes hard for users to trust. LLM-powered methods produce impressive results, but can raise questions about trust due to hallucinations and sensitivity to semantically irrelevant changes in input. This is especially problematic in the banking sector, where certainty is essential. Running large language models is also a huge infrastructural and computational burden, and can cost large sums of money even for smaller datasets.
Our previous tutorial provides a detailed introduction to how LLMs enhance traditional topic modelling by automatically labelling topics. In this article, we combine current topic modelling methods with targeted LLM assistance. In our view, a combination of recent advances in language modelling and classical machine learning gives users the best of both worlds: a pipeline that combines the capabilities of large language models with the computational efficiency, trustworthiness, and stability of probabilistic ML.
This article explains three fresh topic-modelling techniques that should be part of the NLP toolkit in 2026. We will cover:
How to use text prompts to specify what topic models should focus on (i.e., seeded topic models).
How LLM-generated summaries can make topic models more accurate.
How generative models can be used to label topics and provide their descriptions.
How these techniques can be used to gain insights from central banking communication.
We illustrate these techniques on a corpus of central bank communications from the European Central Bank. This type of text is long, carefully structured, and highly repetitive: exactly the kind of data where standard topic models struggle and where interpretability is essential. By combining seeded topic modelling with LLM-assisted document summarization and analysis, we show how to extract focused, stable, and economically meaningful topics without compromising transparency or scalability.
2. Example Data
We use the press conference communications of the European Central Bank (ECB) as example text data. Since 2002, the ECB’s Governing Council has met on the first Thursday of each month, and its communication of the meeting’s outcome follows a two-step structure [2].
How it works: First, at 13:45 CET, the ECB releases a brief monetary policy decision (MPD) statement, which contains only limited textual information. Second, at 14:30 CET, the ECB President delivers an introductory statement during a press conference. This carefully prepared document explains the rationale behind policy decisions, outlines the ECB’s assessment of economic conditions, and provides guidance on future policy considerations. The introductory statement typically lasts about 15 minutes and is followed by a 45-minute Q&A session.
For this article, we use the introductory statements, scraped directly from the ECB website (released with a flexible data licence). The dataset contains 279 statements, and here is what it looks like:
Image 1: ECB communication dataset. Source: Image by authors.
3. Seeded Topic Modelling
Traditionally, topic models focus on identifying the most informative topics in a dataset. A naive approach practitioners take is to fit a larger model and then, usually manually, filter out the topics that are irrelevant to their data question.
What if you could condition a topic model to extract only the topics relevant to your data question? This is precisely what seeded topic modelling is for.
In some methods, this means selecting a set of keywords that reflect your question. But in the framework we explore in this article, you can specify your interest in free text, using a seed phrase that tells the model what to focus on.
3.1 KeyNMF Model
We will use the cutting-edge contextual KeyNMF topic model [3]. In many respects it is very similar to older topic models, as it formulates topic discovery in terms of matrix factorization. In other words, when using this model, you assume that topics are latent factors which your documents contain to a greater or lesser extent, and which determine and explain the content of those documents.
KeyNMF is contextual because, unlike older models, it uses context-sensitive transformer representations of text. To understand how seeded modelling works, we need a basic understanding of the model. The modelling process happens in the following steps (sketched in code after the list):
We encode our documents into dense vectors using a sentence-transformer.
We encode the vocabulary of these documents into the same embedding space.
For each document, we extract the top N keywords by taking the words that have the highest cosine similarity to the document embedding.
Word importance for a given document is then the cosine similarity, pruned at zero. These scores are arranged into a keyword matrix, where each row is a document, and columns correspond to words.
The keyword matrix is decomposed into a topic-term matrix and a document-topic matrix using Nonnegative Matrix Factorization.
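To make these steps concrete, here is a minimal sketch of the algorithm in plain NumPy and scikit-learn. This is an illustration, not the Turftopic implementation; the toy corpus, the value of N, and the number of topics are our own choices.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import normalize

# Toy corpus; any list of document strings works
documents = [
    "The central bank raised interest rates to curb inflation.",
    "Inflation expectations remain anchored over the medium term.",
    "New member states are preparing to adopt the euro.",
    "The euro area economy shows signs of a gradual recovery.",
    "Bond purchases under the asset programme will continue.",
    "Labour markets remain resilient despite slowing growth.",
]

# Steps 1-2: encode documents and vocabulary into the same space;
# L2-normalizing makes dot products equal to cosine similarities
encoder = SentenceTransformer("paraphrase-mpnet-base-v2")
doc_emb = normalize(encoder.encode(documents))
vocab = CountVectorizer(stop_words="english").fit(documents).get_feature_names_out()
word_emb = normalize(encoder.encode(list(vocab)))

# Step 3: cosine similarity of every word to every document
similarity = doc_emb @ word_emb.T

# Step 4: keep the top N words per document, pruning at zero
top_n = 10
keyword_matrix = np.zeros_like(similarity)
for i, row in enumerate(similarity):
    top_idx = np.argsort(-row)[:top_n]
    keyword_matrix[i, top_idx] = np.clip(row[top_idx], 0, None)

# Step 5: factorize into document-topic and topic-term matrices
nmf = NMF(n_components=2, random_state=42)
doc_topic = nmf.fit_transform(keyword_matrix)  # documents x topics
topic_term = nmf.components_                   # topics x words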
The general KeyNMF, while perfectly adequate for discovering topics in a corpus, is not the most suitable choice if we need to use the model for a specific question. To adapt it, we first have to specify a seed phrase: a phrase that minimally indicates what we are interested in. For example, when analysing the ECB communication dataset, this could be “Expansion of the Eurozone”.
As sentence-transformers can encode this seed phrase, we can use it to retrieve documents that are relevant to our question (the sketch above is extended in code after the list):
We encode the seed phrase into the same embedding space as our documents and vocabulary.
To make our model more attentive to documents that contain relevant information, we compute a document relevance score as the cosine similarity to the seed embedding, again pruned at zero.
To exaggerate the seed’s importance, one can apply a seed exponent. This involves raising the document relevance scores to the power of this exponent.
We multiply the keyword matrix’s entries by the document relevance.
We then, as before, use NMF to decompose this, now conditioned, keyword matrix.
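Continuing the sketch from above, the seeding step only adds a few lines (again an illustration, not the Turftopic code; the exponent value is arbitrary):
# Encode the seed phrase into the same embedding space
seed_phrase = "Expansion of the Eurozone"
seed_emb = normalize(encoder.encode([seed_phrase]))[0]

# Document relevance: cosine similarity to the seed, pruned at zero
relevance = np.clip(doc_emb @ seed_emb, 0, None)

# Exaggerate the seed's influence by raising scores to an exponent
relevance = relevance ** 3.0

# Condition the keyword matrix and decompose it as before
seeded_keyword_matrix = keyword_matrix * relevance[:, None]
doc_topic = nmf.fit_transform(seeded_keyword_matrix)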
This approach has two advantages:
1) it is incredibly flexible, and
2) it can save a lot of manual work.
Be careful: some embedding models are sensitive to phrasing and might produce different document-importance scores for the same document given a slightly different seed phrase. To deal with this, we recommend using one of the paraphrase models from sentence-transformers, because they have deliberately been trained to be phrasing-invariant and produce high-quality topics with KeyNMF.
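A quick sanity check before committing to an encoder is to compare the embeddings of two phrasings of your seed; the second phrase below is our own invented paraphrase:
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-mpnet-base-v2")

# Two phrasings of the same seed; a phrasing-robust encoder should
# place them close together in embedding space
a, b = encoder.encode([
    "Expansion of the Eurozone",
    "New countries joining the euro area",
])
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"Seed-phrase similarity: {cosine:.3f}")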
3.2 How to Use Seeded KeyNMF
KeyNMF and its seeded version are available on PyPI in the Turftopic package, in a scikit-learn-compatible form. To specify what you are interested in, simply initialize the model with a seed phrase:
from sentence_transformers import SentenceTransformer
from turftopic import KeyNMF

# Encode documents using a sentence-transformer
# (corpus is the list of ECB introductory statements)
encoder = SentenceTransformer("paraphrase-mpnet-base-v2")
embeddings = encoder.encode(corpus, show_progress_bar=True)

# Initialize KeyNMF with 4 topics and a seed phrase
model = KeyNMF(
    n_components=4,
    encoder=encoder,
    seed_phrase="Expansion of the Eurozone",
    seed_exponent=3.0,
)

# Fit the model, reusing the precomputed embeddings
model.fit(corpus, embeddings=embeddings)

# Print modelled topics
model.print_topics()
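Since the document relevance scores are cosine similarities pruned at zero, they lie between 0 and 1, so a larger seed_exponent suppresses weakly related documents more aggressively and narrows the model’s focus; an exponent of 1.0 leaves the scores unchanged.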
We can see that the model returns topic IDs with typical keywords that are clearly related to the Euro and the Eurozone:
Image 3: Seeded KeyNMF model output. Source: Image by authors.
4. LLM-assisted Topic Modeling
Finding interpretable topics from a corpus is a difficult task, and it often requires more than just a statistical model that finds patterns in the raw data. LLMs serve topic modelling in two main areas:
Reading a document and identifying the right aspects in the text based on a specific data question.
Interpreting the topic model’s output in the relevant context.
In the following, we explore 1) how LLMs improve document processing for a topic model and 2) how generative models improve understanding and interpreting the model results.
One of the Achilles’ heels of the sentence transformers we frequently use for topic analysis is their short context length. Encoder models that can read considerably longer contexts have rarely been evaluated for their performance in topic modelling, so it is unclear whether or how these larger transformer models work in a topic modelling pipeline. Another issue is that they produce higher-dimensional embeddings, which often negatively affect unsupervised machine learning models [4]. This can either be because Euclidean distances get inflated in higher-dimensional space, or because the number of parameters surges with input dimensionality, making parameter recovery more difficult.
We can solve these issues by:
Chunking documents into smaller sections that fit into the context window of a sentence transformer (a minimal chunker is sketched after this list). Unfortunately, chunking can result in text chunks that are wildly out of context, and it might take considerable effort to chunk documents at semantically sensible boundaries.
Using generative models to summarize the contents of these documents. LLMs excel at this task and can also remove all types of tokenization-based noise and irrelevant information from texts that might hinder our topic model.
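For reference, here is a minimal paragraph-boundary chunker of the kind mentioned above; the character budget is a crude stand-in for the encoder's actual token limit:
def chunk_document(text: str, max_chars: int = 2000) -> list[str]:
    """Naively split a document at paragraph boundaries so that
    each chunk stays within a rough character budget."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        # Close the current chunk before it exceeds the budget
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += paragraph + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks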
Let’s now summarise the trade-offs of using LLM-generated summaries in topic modelling in the following image.
Image 5: Benefits and drawbacks of LLM-assisted document processing in the topic modelling pipeline. Source: image by authors.
The recommended strategy for LLM-assisted document preprocessing has two steps:
Train a topic model with simple preprocessing, or no preprocessing at all.
If you find that the topic model has a hard time interpreting your corpus, LLM-based summarisation can be a good choice, provided the trade-offs work out positively in your specific project.
4.1. Document Summarization in Code
Let’s now look at how we can summarize documents using an LLM. In this example, we will use GPT-5-nano, but Turftopic can also run open LLMs locally. We recommend using open LLMs locally, if possible, due to lower costs and better data privacy.
import pandas as pd
from tqdm import tqdm
from turftopic.analyzers import OpenAIAnalyzer

# Load the data
data = pd.read_parquet("data/ecb_data.parquet")

# Write a prompt that extracts the relevant information; we ask the
# model to separate the information into key points so that they
# become easier to model
summary_prompt = (
    "Summarize the following press conference from the European Central "
    "Bank into a set of key points separated by two newline characters. "
    "Reply with the summary only, nothing else.\n{document}"
)

# Initialize a summarizer
summarizer = OpenAIAnalyzer("gpt-5-nano", summary_prompt=summary_prompt)
summaries = []

# Summarize each document, tracking progress with tqdm
for document in tqdm(data["content"], desc="Summarising documents..."):
    summary = summarizer.summarize_document(document)
    # Print summaries as we go as a sanity check that the prompt works
    print(summary)
    summaries.append(summary)

# Collect summaries into a dataframe
summary_df = pd.DataFrame(
    {
        "id": data["id"],
        "date": data["date"],
        "author": data["author"],
        "title": data["title"],
        "summary": summaries,
    }
)
Next, we will fit a simple KeyNMF model on the key points in these summaries, and let the model discover the number of topics using the Bayesian Information Criterion. This approach works very well in this case, but be aware that automatic topic-number detection has its shortcomings. Check out the Topic Model Leaderboard for more information on how models perform at detecting the number of topics.
from turftopic import KeyNMF

# Create the corpus from the text summaries (not the original texts)
corpus = list(summary_df["summary"])

# Collect key points by segmenting each summary at double line breaks
points = []
for doc in corpus:
    for point in doc.split("\n\n"):
        # Strip whitespace and leading bullet markers, drop empty points
        point = point.strip().removeprefix("- ").strip()
        if point:
            points.append(point)

# Tell KeyNMF to automatically detect the number of topics using BIC
model = KeyNMF("auto", encoder="paraphrase-mpnet-base-v2")
doc_topic = model.fit_transform(points)

# Print topic IDs with top words
model.print_topics()
Here are the KeyNMF results trained on document summaries:
Image 6: KeyNMF 10-topic results trained on document summaries. Source: image by authors.
4.2. Topic Analysis with LLMs
In a typical topic-analysis pipeline, a user would first train a topic model, then spend time interpreting what the model has discovered, label topics manually, and finally provide a brief description of the types of documents the topic contains. This is time-consuming, especially in corpora with many identified topics.
This part can now be done by LLMs that can easily generate human-readable topic names and descriptions. We will use the same Analyzer API from Turftopic to achieve this:
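As a sketch of what that call might look like: the method name analyze_topic_model below is our assumption, so consult the Turftopic documentation for the exact Analyzer API; only the analysis_result object and its topic_names attribute are reused later in this article.
from turftopic.analyzers import OpenAIAnalyzer

# Sketch only: the method name below is our assumption; check the
# Turftopic documentation for the exact Analyzer API
analyzer = OpenAIAnalyzer("gpt-5-nano")
analysis_result = analyzer.analyze_topic_model(model)

# LLM-generated, human-readable topic names
print(analysis_result.topic_names)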
We apply the analyzer to the introductory statements issued by the ECB, which accompany each monetary policy decision. These statements are prepared carefully and follow a relatively standard structure. Here are the labelled topic names with their descriptions and top words printed from analysis_result:
Image 7: Topic Analysis using GPT-5-nano in Turftopic. Source: image by authors.
Next, let’s show the prevalence of the labelled KeyNMF topic names over time, that is, how intensely these topics were discussed in the ECB press conferences over the last 25 years:
from datetime import datetime

import pandas as pd
import plotly.express as px
from scipy.signal import savgol_filter

# Create a dataframe from the labelled topics, combined with a
# timestamp for each row of doc_topic; `timestamps` should hold the
# date of the document each key point came from
time_df = pd.DataFrame(
    dict(
        date=timestamps,
        # Normalize each row of doc_topic so topic weights sum to one
        **dict(zip(analysis_result.topic_names, doc_topic.T / doc_topic.sum(axis=1))),
    )
).set_index("date")

# Aggregate the dataframe to monthly frequency
time_df = time_df.groupby(by=[time_df.index.month, time_df.index.year]).mean()
time_df.index = [datetime(year=y, month=m, day=1) for m, y in time_df.index]
time_df = time_df.sort_index()

# Smooth each topic series with a Savitzky-Golay filter
for col in time_df.columns:
    time_df[col] = savgol_filter(time_df[col], 12, 2)

# Display the dataframe with Plotly
fig = px.line(
    time_df,
    template="plotly_white",
)
fig.show()
Here is the labelled topic model dataframe, aggregated to monthly frequency and displayed over time:
Image 8: Topic Analysis using GPT-5-nano in Turftopic over time. Source: Image by authors.
Model results in context: The monetary union topic was most prominent in the early 2000s (see [5] for more information). The monetary policy and rate decision topic peaks at the end of the global financial crisis around 2011, a period during which the ECB (some commentators argue mistakenly) raised interest rates. The timing of the inflation and inflation expectations topic also corresponds with economic developments: it rises sharply around 2022, when energy prices pushed inflation into double-digit territory in the euro area for the first time since its creation.
5. Summary
Let’s now summarize the key points of the article. The requirements and code for this tutorial are in this repo.
The seeded KeyNMF topic model combines free-text prompts with a state-of-the-art topic model to concentrate modelling on a specific problem.
Summarizing data for topic modelling reduces training time, but it has drawbacks that should be weighed in each project.
The Turftopic Python package integrates systematic topic labels and descriptions generated by recent LLMs into the topic modelling pipeline.
[2] Carlo Altavilla, Luca Brugnolini, Refet S. Gürkaynak, Roberto Motto, and Giuseppe Ragusa. 2019. Measuring euro area monetary policy. In: Journal of Monetary Economics, Volume 108, pp. 162–179.
[3] Ross Deans Kristensen-McLachlan, Rebecca M.M. Hicke, Márton Kardos, and Mette Thunø. 2024. Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media. In: CHR 2024: Computational Humanities Research Conference, December 4–6, 2024, Aarhus, Denmark.
[4] Márton Kardos, Jan Kostkan, Kenneth Enevoldsen, Arnault-Quentin Vermillet, Kristoffer Nielbo, and Roberta Rocca. 2025. S3 – Semantic Signal Separation. In: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 633–666, Vienna, Austria. Association for Computational Linguistics.
[6] Michelle S. Lam, Janice Teoh, James A. Landay, Jeffrey Heer, and Michael S. Bernstein. 2024. Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM. In: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI ’24). Association for Computing Machinery, New York, NY, USA, Article 766, 1–28. https://doi.org/10.1145/3613904.3642830.
[7] Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, and Mohit Iyyer. 2024. TopicGPT: A Prompt-based Topic Modeling Framework. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 2956–2984, Mexico City, Mexico. Association for Computational Linguistics.