InternVL

3wks agoupdate 00

Open MLLM excelling in vision, reasoning, and long context via multimodal pre-training.

Collection time:
2025-01-01
InternVLInternVL

What is InternVL?

InternVL is an Open MLLM family (1B-78B) from OpenGVLab that excels at vision, reasoning, long context & agents via native multimodal pre-training. It outperforms base LLMs on text tasks.


How to use InternVL?

You can ask InternVL questions. Examples include asking what a person is looking at, implementing a flowchart using Python, and relating images to each other.


InternVL’s Core Features

Multimodal pre-training Vision and reasoning capabilities Long context understanding Agent capabilities Outperforms base LLMs on text tasks


InternVL’s Use Cases

  • Answering questions about images
  • Implementing flowcharts using Python
  • Relating different images to each other
  • Identifying mistakes in translations

Relevant Navigation