We introduce TableLLM, a robust large language model (LLM) with 13 billion parameters, purpose-built for handling tabular data manipulation tasks on tables embedded in documents or spreadsheets, as encountered in real-world office scenarios. For training, we propose a distant supervision method that combines a reasoning-process extension strategy, which helps the model learn reasoning patterns more effectively, with a cross-way validation strategy, which ensures the quality of the self-generated training data. To evaluate TableLLM, we construct a benchmark covering both document and spreadsheet formats, together with an evaluation pipeline capable of handling both scenarios. Thorough evaluations show the advantages of TableLLM over a range of existing general-purpose and tabular-data-focused LLMs.
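The paragraph above does not spell out the mechanics of cross-way validation, so the snippet below is only a minimal illustrative sketch of one plausible reading: a self-generated (table, question, code solution, text answer) example is kept only when the answer obtained by executing the code solution agrees with the independently generated text answer. The function name `cross_way_validate`, the convention that the code solution assigns to `result`, and the exact-match comparison are assumptions made for illustration, not details of the released training pipeline.

```python
import pandas as pd

def cross_way_validate(table: pd.DataFrame, code_solution: str, text_answer: str) -> bool:
    """Illustrative sketch (assumed interpretation): keep a self-generated example
    only if executing its code solution reproduces its text answer."""
    scope = {"df": table.copy()}
    try:
        # Assumption: the generated code solution stores its answer in `result`.
        exec(code_solution, {"pd": pd}, scope)
    except Exception:
        return False  # discard examples whose code does not run
    return str(scope.get("result", "")).strip() == text_answer.strip()

# Hypothetical usage on a toy table.
df = pd.DataFrame({"city": ["Beijing", "Shanghai"], "population": [2154, 2487]})
code = "result = df.loc[df['population'].idxmax(), 'city']"
print(cross_way_validate(df, code, "Shanghai"))  # True
```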
We evaluate the code-solution generation ability of TableLLM on three benchmarks: WikiSQL, Spider, and our self-created table operation benchmark. Its text-answer generation ability is tested on four benchmarks: WikiTableQuestions (WikiTQ), TAT-QA, FeTaQA, and OTTQA. The evaluation results are shown in the table below; a sketch of how the two answer formats are scored follows the table.
Model | WikiTQ | TAT-QA | FeTaQA | OTTQA | WikiSQL | Spider | Self-created | Average |
---|---|---|---|---|---|---|---|---|
TaPEX | 38.5 | – | – | – | 83.9 | 15.0 | / | 45.8 |
TaPas | 31.5 | – | – | – | 74.2 | 23.1 | / | 42.9 |
TableLlama | 24.0 | 22.2 | 20.5 | 6.4 | 43.7 | 9.0 | / | 20.7 |
GPT3.5 | 58.5 | 72.1 | 71.2 | 60.8 | 81.7 | 67.4 | 77.1 | 69.8 |
GPT4 | 74.1 | 77.1 | 78.4 | 69.5 | 84.0 | 69.5 | 77.8 | 75.8 |
Llama2-Chat (13B) | 48.8 | 49.6 | 67.7 | 61.5 | – | – | – | 56.9 |
CodeLlama (13B) | 43.4 | 47.2 | 57.2 | 49.7 | 38.3 | 21.9 | 47.6 | 43.6 |
Deepseek-Coder (33B) | 6.5 | 11.0 | 7.1 | 7.4 | 72.5 | 58.4 | 73.9 | 33.8 |
StructGPT (GPT3.5) | 52.5 | 27.5 | 11.8 | 14.0 | 67.8 | 84.8 | / | 48.9 |
Binder (GPT3.5) | 61.6 | 12.8 | 6.8 | 5.1 | 78.6 | 52.6 | / | 42.5 |
DATER (GPT3.5) | 53.4 | 28.4 | 18.3 | 13.0 | 58.2 | 26.5 | / | 37.0 |
TableLLM-7B (Ours) | 58.8 | 66.9 | 72.6 | 63.1 | 86.6 | 82.6 | 78.8 | 72.8 |
TableLLM-13B (Ours) | 62.4 | 68.2 | 74.5 | 62.5 | 90.7 | 83.4 | 80.8 | 74.7 |
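Code-solution benchmarks and text-answer benchmarks are scored differently in the evaluation pipeline. The sketch below shows one way this two-branch scoring can be approximated: code-solution benchmarks (WikiSQL, Spider, the table operation benchmark) are scored by executing the generated query against the table, while text-answer benchmarks (WikiTQ, TAT-QA, FeTaQA, OTTQA) are scored by comparing normalized strings. The helper names (`execute_sql`, `score`), the in-memory SQLite setup, and the lower-cased exact match are illustrative assumptions; the released pipeline may normalize answers and aggregate metrics differently.

```python
import sqlite3
import pandas as pd

def execute_sql(table: pd.DataFrame, sql: str, table_name: str = "t") -> str:
    """Run a generated SQL query against an in-memory SQLite copy of the table
    and flatten a single-cell result into a string for comparison."""
    conn = sqlite3.connect(":memory:")
    try:
        table.to_sql(table_name, conn, index=False)
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    return str(rows[0][0]) if rows and len(rows[0]) == 1 else str(rows)

def score(prediction: str, gold: str, table: pd.DataFrame, answer_type: str) -> bool:
    """Illustrative scorer: 'code' predictions (WikiSQL/Spider-style SQL) are executed,
    'text' predictions (WikiTQ/TAT-QA/FeTaQA/OTTQA-style answers) are string-matched."""
    if answer_type == "code":
        try:
            return execute_sql(table, prediction).strip().lower() == gold.strip().lower()
        except Exception:
            return False
    return prediction.strip().lower() == gold.strip().lower()

# Hypothetical usage on a toy table.
df = pd.DataFrame({"team": ["A", "B"], "wins": [3, 5]})
print(score("SELECT team FROM t WHERE wins = 5", "B", df, "code"))  # True
print(score("Team B", "team b", df, "text"))                        # True
```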
If you have any questions, please open a GitHub issue or get in touch with us at zhang2718@ruc.edu.cn, zeyaoma@ruc.edu.cn, or zhang-jing@ruc.edu.cn.