# Playing with ColBERTv2 Embeddings and Retrieval

<time id="post-date">2024-05-09</time>

<p id="post-excerpt">
There are a lot of embedding models out there for LLMs.
ColBERTv2 is a neat one.
Here are some thoughts and code examples.
</p>

## ColBERTv2

The way you shove data into any embedding model can make a difference,
and ColBERT is no exception.
I started off just giving it an HTML file
with the entirety of a website ([vimbook's print-site one-pager](https://www.vim-book.org/print_page/)).
This had a bunch of junk that wasn't needed,
which occasionally affected the

[sqlite-utils insert-files](https://sqlite-utils.datasette.io/en/stable/cli.html#id43)
https://github.com/bclavie/RAGatouille

Multiline script example:

```sh
# enable multilib - see link below
paru # make sure things are up to date generally
paru -S android-tools android-sdk-build-tools # includes adb and other goodies
reboot
```

Image example: ![Source selection](/images/ncmpcpp-mopidy-selector.png)
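
On the junk in the raw HTML file: one way to cut it down before handing the text to an embedding model is to keep only the visible text and skip tags that rarely carry content. The sketch below is not what I actually ran; it uses only Python's standard-library `html.parser`, and the `TextExtractor` class, the `html_to_text` helper, and the choice of tags to skip are all made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that rarely hold useful content."""

    # Assumption: these tags are "junk" for retrieval; adjust per site.
    SKIP_TAGS = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside any skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red}</style></head>"
        "<body><nav>Home</nav><p>Use <b>hjkl</b> to move.</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(html_to_text(sample))
```

Joining everything with spaces throws away paragraph boundaries, so for real indexing it would probably be better to split on block-level tags first and pass the pieces in as separate documents.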