Performance Assessment of LALMs and Multi-Modality Models

This section details the evaluation of several instruction-following large audio-language models (LALMs) and multi-modality models, including SpeechGPT and SALMONN, alongside a cascaded Whisper + GPT-4 baseline, comparing their performance on both the foundation and chat benchmarks using the largest publicly available model checkpoints.


This content originally appeared on HackerNoon and was authored by Benchmarking in Business Technology and Software

:::info Authors:

(1) Qian Yang, Zhejiang University, Equal contribution. This work was conducted during Qian Yang’s internship at Alibaba Group;

(2) Jin Xu, Alibaba Group, Equal contribution;

(3) Wenrui Liu, Zhejiang University;

(4) Yunfei Chu, Alibaba Group;

(5) Xiaohuan Zhou, Alibaba Group;

(6) Yichong Leng, Alibaba Group;

(7) Yuanjun Lv, Alibaba Group;

(8) Zhou Zhao, Alibaba Group, corresponding author (zhaozhou@zju.edu.cn);

(9) Chang Zhou, Alibaba Group, corresponding author (ericzhou.zc@alibaba-inc.com);

(10) Jingren Zhou, Alibaba Group.

:::

Abstract and 1. Introduction

2 Related Work

3 AIR-Bench and 3.1 Overview

3.2 Foundation Benchmark

3.3 Chat Benchmark

3.4 Evaluation Strategy

4 Experiments

4.1 Models

4.2 Main Results

4.3 Human Evaluation and 4.4 Ablation Study of Positional Bias

5 Conclusion and References

A Detailed Results of Foundation Benchmark

4.1 Models

We evaluate the performance of various LALMs with instruction-following capabilities. These models are either open-sourced or accessible through public APIs, including SpeechGPT (Zhang et al., 2023), BLSP (Wang et al., 2023a), SALMONN (Tang et al., 2023a), Qwen-Audio-Chat (Chu et al., 2023), and Qwen-Audio Turbo [3]. Additionally, we consider large multi-modality models with audio-understanding abilities, such as PandaGPT (Su et al., 2023), Macaw-LLM (Lyu et al., 2023), and NExT-GPT (Wu et al., 2023b). As a baseline for speech-related tasks, we also include a concatenative approach that pairs Whisper-large-v2 (Radford et al., 2023) with GPT-4 Turbo (OpenAI, 2023): the audio is first transcribed, and the text-only LLM then answers over the transcript. We evaluate all of these models on both the foundation and chat benchmarks, using their latest publicly available checkpoints; when multiple checkpoints exist, we select the one with the largest parameter size. For all models, we directly follow their default decoding strategies during evaluation.

Table 3: The comparison of different LALMs on AIR-Bench.

Table 4: The success rate of different strategies of matching hypotheses with the golden choices for the foundation benchmark.
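The concatenative Whisper + GPT-4 baseline can be sketched as a two-stage pipeline: an ASR stage produces a transcript, and a text-only LLM then answers the instruction over that transcript. The function and stub names below are illustrative assumptions, not the authors' code; in practice `transcribe_fn` would wrap Whisper-large-v2 and `llm_fn` would call the GPT-4 Turbo API.

```python
# Minimal sketch of a concatenative (cascade) speech pipeline:
# stage 1 transcribes the audio, stage 2 runs a text-only LLM on
# the transcript plus the instruction. All names are hypothetical.

def cascade_answer(audio_path, instruction, transcribe_fn, llm_fn):
    """Two-stage baseline: ASR output is spliced into an LLM prompt."""
    transcript = transcribe_fn(audio_path)  # e.g. Whisper-large-v2
    prompt = (
        f"Transcript of the audio: {transcript}\n"
        f"Instruction: {instruction}"
    )
    return llm_fn(prompt)  # e.g. GPT-4 Turbo

# Toy stand-ins so the sketch runs without model weights or API keys.
fake_asr = lambda path: "hello world"
fake_llm = lambda prompt: f"LLM saw: {prompt}"

print(cascade_answer("sample.wav", "Summarize the speech.", fake_asr, fake_llm))
```

Note that such a cascade can only use information that survives transcription; paralinguistic cues (speaker emotion, background sounds, music) are lost, which is one motivation for evaluating end-to-end LALMs against it.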


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::


[3] https://help.aliyun.com/zh/dashscope/developerreference/qwen-audio-api



