Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful framework for enhancing Large Language Models (LLMs) by incorporating external knowledge through information retrieval (IR) techniques. However, in question-answering tasks, RAG often retrieves documents that are only semantically similar to the query and may not contain the information needed to generate an accurate response. To address this limitation, we propose an improved retrieval pipeline that combines dense vector search with a re-ranking mechanism to more effectively identify and extract highly relevant knowledge from the retrieved content. We evaluate our approach on two Chinese datasets, TTQA and TMMLU+, using 17 different LLMs. Experimental results show that our method improves performance by up to 21.24% over baseline approaches. The gains are particularly notable on two finance-related subsets of TMMLU+, after domain-specific financial regulations are incorporated into the knowledge base used for that dataset.
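The two-stage idea described above (dense vector search followed by re-ranking) can be illustrated with a minimal, dependency-free sketch. This is not the pipeline evaluated in the paper: `embed` is a hypothetical stand-in for a dense encoder (here a simple bag-of-words vector), `rerank_score` is a hypothetical stand-in for a learned re-ranker (here query-term coverage), and the English toy documents and whitespace tokenization are assumptions made only for readability. The sketch is meant to show why a second scoring pass can promote a passage that similarity search alone ranks lower.

```python
import math
from collections import Counter

# Toy corpus and query (illustrative only, not from TTQA or TMMLU+).
DOCS = [
    "capital banks",
    "the capital reserve requirement for banks is set by the regulator "
    "and reviewed by the board each year",
    "weather forecast for tomorrow",
    "cooking recipes with rice",
]
QUERY = "capital reserve requirement for banks"


def embed(text):
    """Hypothetical stand-in for a dense encoder: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k):
    """Stage 1: keep the k documents most similar to the query vector."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def rerank_score(query, doc):
    """Hypothetical stand-in for a re-ranker: share of distinct query
    terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def retrieve_and_rerank(query, docs, k=2, top_n=1):
    """Stage 2: re-score the stage-1 candidates and keep the best top_n."""
    candidates = retrieve(query, docs, k)
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d),
                      reverse=True)
    return reranked[:top_n]


if __name__ == "__main__":
    print("similarity only:", retrieve(QUERY, DOCS, 1))
    print("with re-ranking:", retrieve_and_rerank(QUERY, DOCS, k=2, top_n=1))
```

In this toy setup the short first document wins on cosine similarity because its vector is short, while the second document covers every query term and is promoted by the second pass. A real system would replace both stand-ins with trained models and a tokenizer suited to Chinese text.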
| Original language | English |
|---|---|
| Journal | Proceedings - IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS |
| Issue number | 2025 |
| Publication status | Published - 2025 |
| Event | 2025 IEEE International Conference on Advanced Visual and Signal-Based Systems, AVSS 2025 - Tainan, Taiwan; Duration: 2025 Aug 11 → 2025 Aug 13 |
ASJC Scopus subject areas
- Signal Processing
- Media Technology
- Computer Vision and Pattern Recognition
- Artificial Intelligence