A journey through the hidden representations of pretrained language models: semantics, factuality and beyond / De Bellis, Alessandro. - ELETTRONICO. - (2026).

A journey through the hidden representations of pretrained language models: semantics, factuality and beyond

De Bellis, Alessandro
2026

Abstract

The rapid emergence of pretrained language models (PLMs) has fundamentally transformed Natural Language Processing (NLP). Despite their success, the standard fine-tuning paradigm introduces significant limitations, including knowledge destabilization, high computational overhead, reproducibility challenges, and limited transparency. These constraints impede both research progress and the scalable deployment of PLMs in industrial contexts. This dissertation investigates an alternative approach that leverages the frozen internal representations of PLMs without modifying their parameters. By treating PLMs as fixed computational artifacts, the study examines how their latent representational geometry encodes structured, task-relevant information that can be systematically extracted and utilized. Empirical analyses demonstrate that frozen PLMs inherently encode diverse forms of knowledge. This includes hierarchical semantic relationships useful for fine-grained entity typing and ontology completion, which in turn enhance graph-based relational inference. Their representations also contain signals related to factuality, enabling potential self-assessment of generated content. Beyond semantic information, frozen PLMs are shown to capture affective and evaluative cues relevant to downstream tasks, such as sentiment analysis, and to exhibit zero-shot generalization in recommendation and search tasks. Overall, the findings illustrate that a broad spectrum of latent capabilities can be accessed from PLMs without weight modification. These contributions advance interpretability by linking internal representations to human-interpretable semantics, factuality, and affective dimensions, and promote sustainability by reducing dependence on repeated, resource-intensive fine-tuning. The work supports a paradigm in which the capabilities of PLMs are harnessed transparently, efficiently, and responsibly for future NLP and AI applications.
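The core technique implied by the abstract — extracting task-relevant structure from a frozen model's representations without updating its weights — is commonly realized as a linear probe. The sketch below is purely illustrative and uses synthetic vectors as stand-ins for frozen PLM hidden states (e.g., final-layer sentence embeddings); the class geometry, dimensions, and learning rate are all assumptions, not details from the dissertation.

```python
import numpy as np

# Illustrative sketch: a linear probe trained on frozen features.
# The "hidden states" are synthetic stand-ins for embeddings from a
# frozen PLM; no model weights are updated -- only the probe learns.

rng = np.random.default_rng(0)
dim, n = 32, 400

# Two synthetic classes separated along a random unit direction,
# mimicking a task-relevant axis in representation space.
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=n)
features = rng.normal(size=(n, dim)) + 2.0 * np.outer(labels - 0.5, direction)

# Logistic-regression probe fit with plain gradient descent.
w, b = np.zeros(dim), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # predicted probabilities
    w -= 0.5 * (features.T @ (p - labels) / n)     # gradient step on weights
    b -= 0.5 * np.mean(p - labels)                 # gradient step on bias

acc = np.mean(((features @ w + b) > 0) == labels)
print(f"probe accuracy: {acc:.2f}")
```

If the probe recovers the labels well above chance, the (here synthetic) representation space linearly encodes the property being probed — the same logic the dissertation applies to semantics, factuality, and affective signals in real frozen PLMs.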
pretrained language models; hidden representations; semantics; factuality; interpretability; natural language processing; deep learning; machine learning; artificial intelligence; nlp
Files in this record:

38 ciclo-DE BELLIS Alessandro.pdf
Open access
Description: Complete doctoral thesis, including title page
Type: Doctoral thesis
License: Creative Commons
Size: 7.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/295622