
anandMohanan/personal-vlm


Personal VLM

A learning project for building a custom vision-language model from personal photos.

This is a small pipeline I put together to understand how fine-tuning VLMs works end to end. It takes a folder of photos, generates a caption for each one with LLaVA (llava:13b), and saves the results in a format ready for fine-tuning.
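The flow described above (photos in, captions generated, dataset out) can be sketched roughly as follows. This is an illustration only, not the repository's actual code: the function names, the JSONL output with `image` and `caption` fields, and the `placeholder_caption` stub are all assumptions. In a real run the stub would be swapped for a call to the captioning model.

```python
import json
from pathlib import Path
from typing import Callable

# Assumed set of photo extensions; adjust to match your own folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_photos(folder: Path) -> list[Path]:
    """Return image files directly inside `folder`, sorted for stable output."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_records(folder: Path, caption_fn: Callable[[Path], str]) -> list[dict]:
    """Pair every photo with the caption produced by `caption_fn`."""
    return [{"image": p.name, "caption": caption_fn(p)} for p in find_photos(folder)]


def write_jsonl(records: list[dict], out_path: Path) -> None:
    """Write one JSON object per line (a hypothetical fine-tuning format)."""
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def placeholder_caption(path: Path) -> str:
    """Stand-in for the real captioning call (hypothetical; no model involved)."""
    return f"A photo named {path.stem}"
```

Calling `write_jsonl(build_records(Path("photos"), placeholder_caption), Path("dataset.jsonl"))` would produce one line per photo. Passing the captioner in as a function keeps the folder scan and file writing testable without a model running.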
