Foojay.io, the Friends Of OpenJDK! - S03 / E56

Vectors in Java Code, Database, and LLMs (#56)

In this Foojay podcast, we enter the world of mathematics by discussing Vectors and how they are crucial for AI and machine learning. As ChatGPT explains: "A Vector is a mathematical structure that holds numerical values. Vectors are fundamental to the field of Artificial Intelligence, as they allow mathematical operations to be performed efficiently and form the basis of many machine learning algorithms." OK, but how are these vectors crucial for the whole Artificial Intelligence evolution?
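To make the definition above a little more concrete, here is a minimal sketch in plain Java of one operation that is commonly performed on vectors: cosine similarity, which scores how closely two vectors point in the same direction. The class name `VectorSimilarity` and the three-dimensional "embeddings" are made up for illustration and do not come from the episode or from any particular library; real embeddings typically have hundreds or thousands of dimensions.

```java
public final class VectorSimilarity {

    private VectorSimilarity() {}

    /** Dot product of two equal-length vectors. */
    public static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite. */
    public static double cosine(float[] a, float[] b) {
        double norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot(a, b) / norms;
    }

    public static void main(String[] args) {
        // Hypothetical toy embeddings, invented for this example only.
        float[] cat = {0.9f, 0.1f, 0.3f};
        float[] kitten = {0.8f, 0.2f, 0.4f};
        float[] car = {0.1f, 0.9f, 0.2f};

        System.out.println("cat vs kitten: " + cosine(cat, kitten));
        System.out.println("cat vs car:    " + cosine(cat, car));
    }
}
```

With these invented numbers, the first pair scores higher than the second. A vector search is built on the same kind of comparison, applied across a very large number of stored vectors.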

This is the last podcast of season 3. We're taking a summer break and will be back in September with the release of Java 23 and many more OpenJDK-related topics!

Guests

Jonathan Ellis

Alexander Chatzizacharias

Content

00:00 Introduction of the topic and guests
01:57 What is a Vector?
   https://github.com/openai/tiktoken
   https://arxiv.org/abs/1301.3781
   https://towardsdatascience.com/word2vec-research-paper-explained-205cb7eecc30
   https://github.com/jbellis/jvector 
07:14 Vectors explained as a game
   A fun and absurd introduction to Vector Databases: https://www.youtube.com/watch?v=mQGf9hWTqSw
09:44 Understanding tokenizers
10:40 Do we need dedicated Vector databases?
13:39 Vectors, LLMs and hallucinations
   Crafting your own RAG system: Leveraging 30+ LLMs for enhanced performance by Stephan Janssen: https://www.youtube.com/watch?v=9PX5l4ETn0g
20:40 How LLM and chat interfaces are used in companies
   https://www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know
23:45 Indexing all of Wikipedia
   https://foojay.io/today/indexing-all-of-wikipedia-on-a-laptop/
   Demo application: https://jvectordemo.com:8443/
   https://openjdk.org/projects/panama/
27:23 Evolutions in Java for vectors, LLMs, and AI
   Vector API (Eighth Incubator): https://openjdk.org/jeps/469
   Foreign Function & Memory API: https://openjdk.org/jeps/454
32:44 Is the GPU needed for vector use cases?
35:04 Can we already use the incubator Vector API in production?
38:27 Some predictions...
   ColBERT project: https://github.com/stanford-futuredata/ColBERT
   https://thenewstack.io/overcoming-the-limits-of-rag-with-colbert/
44:19 Make your vectors smaller to make them more efficient and less expensive
   https://www.sciencedirect.com/topics/engineering/vector-quantization
   https://huggingface.co/blog/embedding-quantization
   https://foojay.io/today/visualizing-brain-computer-interface-data-using-javafx/
   Asteroids 3D in JavaFX made from AI Deep Fake Audio data: https://www.youtube.com/watch?v=vFThM9BoTLg
49:19 Outro

About Foojay.io, the Friends Of OpenJDK!

The podcast of foojay.io, a central resource for the Java community’s daily information needs, a place for friends of OpenJDK, and a community platform for the Java ecosystem, bringing together and helping Java professionals everywhere.
