TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

Table Manipulation using TableLLM on our platform

Abstract

We introduce TableLLM, a robust large language model (LLM) with 8 billion parameters, purpose-built for handling tabular data manipulation tasks, whether the tables are embedded in documents or spreadsheets, in real-world office scenarios. We propose a distant supervision method for training that comprises a reasoning process extension strategy, which helps the model learn reasoning patterns more effectively, and a cross-way validation strategy, which ensures the quality of the automatically generated training data. To evaluate TableLLM, we craft benchmarks covering both document and spreadsheet formats and build a well-organized evaluation pipeline capable of handling both scenarios. Thorough evaluations underscore the advantages of TableLLM over existing general-purpose and tabular-data-focused LLMs.
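For readers who want to try the model, the sketch below shows one way to query a causal-LM checkpoint with a serialized table using the Hugging Face `transformers` library. The checkpoint id, prompt layout, and generation settings are illustrative assumptions, not the official usage; consult the released model card for the exact prompt template.

```python
# Minimal inference sketch. Assumptions: the checkpoint id, prompt format, and
# generation settings below are illustrative, not the official ones.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "RUCKBReasoning/TableLLM-8B"  # hypothetical checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# A small table serialized as CSV text, plus a question about it.
table_csv = "name,salary\nAlice,5000\nBob,7000"
question = "Who has the highest salary?"
prompt = f"Given the table:\n{table_csv}\n\nAnswer the question: {question}\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```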

Overview

Evaluation Results

We evaluate the code-solution generation ability of TableLLM on three benchmarks: WikiSQL, Spider, and a self-created table operation benchmark. The text-answer generation ability is tested on three benchmarks: WikiTableQuestions (WikiTQ), TAT-QA, and FeTaQA. The evaluation results are shown below:

| Model | WikiTQ | TAT-QA | FeTaQA | WikiSQL | Spider | Self-created | Average |
|---|---|---|---|---|---|---|---|
| TaPEX | 38.5 | – | – | 83.9 | 15.0 | / | 45.8 |
| TaPas | 31.5 | – | – | 74.2 | 23.1 | / | 42.9 |
| TableLlama | 24.0 | 22.2 | 20.5 | 43.7 | 9.0 | / | 20.7 |
| TableGPT2 (7B) | 77.3 | 88.1 | 75.6 | 63.0 | 77.3 | 74.4 | 76.0 |
| Llama3.1 (8B) | 71.9 | 74.3 | 83.4 | 40.6 | 18.8 | 43.2 | 55.3 |
| GPT3.5 | 58.5 | 72.1 | 71.2 | 81.7 | 67.4 | 77.1 | 69.8 |
| GPT4o | 91.5 | 91.5 | 94.4 | 84.0 | 69.5 | 77.8 | 84.8 |
| CodeLlama (13B) | 43.4 | 47.2 | 57.2 | 38.3 | 21.9 | 47.6 | 43.6 |
| Deepseek-Coder (33B) | 6.5 | 11.0 | 7.1 | 72.5 | 58.4 | 73.9 | 33.8 |
| StructGPT (GPT3.5) | 52.5 | 27.5 | 11.8 | 67.8 | 84.8 | / | 43.1 |
| Binder (GPT3.5) | 61.6 | 12.8 | 6.8 | 78.6 | 52.6 | / | 36.3 |
| DATER (GPT3.5) | 53.4 | 28.4 | 18.3 | 58.2 | 26.5 | / | 33.0 |
| TableLLM-8B (Ours) | 89.1 | 89.5 | 93.4 | 89.6 | 81.1 | 77.8 | 86.7 |
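As a rough illustration of how answers could be scored in an evaluation pipeline like the one described above, the sketch below aggregates exact-match accuracy over normalized answer strings. This is a simplified stand-in, not the official metric: a real pipeline would typically execute the generated code for the code-solution benchmarks and apply benchmark-specific metrics for the text-answer benchmarks, so treat this only as a sketch of the final accuracy aggregation step.

```python
# Hedged sketch: exact-match accuracy over normalized answers. Illustrative
# stand-in, not the official metric for every benchmark above.
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(str(answer).lower().split())


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Percentage of predictions whose normalized form matches the reference."""
    assert len(predictions) == len(references) and references
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Toy usage: two of the three predictions match, so the score is 66.7.
preds = ["Bob", " 7000 ", "alice"]
golds = ["Bob", "7000", "Charlie"]
print(f"{accuracy(preds, golds):.1f}")
```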

Contact

If you have any questions, we encourage you to create a GitHub issue or get in touch with us at zhang2718@ruc.edu.cn, luosijia0906@ruc.edu.cn, or zhang-jing@ruc.edu.cn.