
BradyFU/Awesome-Multimodal-Large-Language-Models
We are very proud to launch Video-MME, the first-ever comprehensive evaluation benchmark of MLLMs in Video Analysis! 🌟 It includes short (< 2 min), medium (4–15 min), and long (30–60 min) videos, ranging from 11 seconds to 1 hour. All data are newly collected and annotated by humans, not drawn from any existing video dataset.
GitHub - UbiquitousLearning/mllm: Fast Multimodal LLM on ...
mllm reuses many low-level kernel implementations from ggml on ARM CPUs. It also uses stb and wenet to pre-process images and audio. mllm has also benefited from the following …
[2306.13549] A Survey on Multimodal Large Language Models
June 23, 2023 · Recently, Multimodal Large Language Models (MLLMs), represented by GPT-4V, have emerged as a new research hotspot: they use powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.
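To make the survey's "LLM as a brain" framing concrete, here is a minimal, illustrative PyTorch sketch of the layout it describes: a vision encoder produces patch features, a small projector maps them into the LLM's embedding space, and the LLM consumes visual tokens alongside text tokens. All class names, dimensions, and the toy encoder/decoder stand-ins are assumptions for illustration, not code from any project listed here.

```python
# Minimal sketch of a typical MLLM pipeline (illustrative only).
import torch
import torch.nn as nn

class ToyMLLM(nn.Module):
    def __init__(self, d_vision=768, d_model=1024, vocab=32000):
        super().__init__()
        # Stand-in for a pretrained vision encoder (usually frozen in practice).
        self.vision_encoder = nn.Linear(d_vision, d_vision)
        # Projector mapping visual features into the LLM's token space
        # (often a linear layer or small MLP in LLaVA-style models).
        self.projector = nn.Linear(d_vision, d_model)
        self.text_embed = nn.Embedding(vocab, d_model)
        # Stand-in for the LLM "brain": a tiny transformer stack.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.llm = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab)

    def forward(self, patch_feats, text_ids):
        # patch_feats: (B, num_patches, d_vision); text_ids: (B, seq_len)
        vis_tokens = self.projector(self.vision_encoder(patch_feats))
        txt_tokens = self.text_embed(text_ids)
        # Prepend visual tokens so the LLM attends over both modalities.
        hidden = self.llm(torch.cat([vis_tokens, txt_tokens], dim=1))
        return self.lm_head(hidden)

model = ToyMLLM()
logits = model(torch.randn(1, 16, 768), torch.randint(0, 32000, (1, 8)))
print(logits.shape)  # torch.Size([1, 24, 32000])
```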
Multimodal Large Language Models (MLLMs) transforming ...
June 30, 2024 · This article introduces what a Multimodal Large Language Model (MLLM) is [1], shows its applications on challenging prompts, and surveys the top models reshaping Computer Vision as we speak.
[2408.01319] A Comprehensive Review of Multimodal Large ...
August 2, 2024 · In an era defined by the explosive growth of data and rapid technological advancements, Multimodal Large Language Models (MLLMs) stand at the forefront of artificial intelligence (AI) systems.
[2401.13601] MM-LLMs: Recent Advances in MultiModal Large ...
January 24, 2024 · In the past year, MultiModal Large Language Models (MM-LLMs) have undergone substantial advancements, augmenting off-the-shelf LLMs to support multimodal inputs or outputs via cost-effective training strategies.
MLLM-Tool: A Multimodal Large Language Model For ... - GitHub
This repository hosts the code, data, and model weights of MLLM-Tool, the first tool-agent MLLM able to perceive visual and auditory input and recommend appropriate tools for multimodal instructions.