Create vector fields in your dataset to power AI features

Vectors unlock AI discovery

Vectors are the data format that powers our advanced discovery features, such as AI search and recommendations.

Vectorization uses machine learning to extract the meaning of your content into a large array containing thousands of numbers. Each number represents a different dimension of the content's meaning.
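
To make the idea concrete, here is a minimal Python sketch. The four-dimensional vectors are invented for illustration (real vectors are far larger), and cosine similarity is assumed here only as a common way to compare vectors; it is not a statement about the measure the platform uses.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors; the values are made up for this example.
running_shoes = [0.9, 0.1, 0.8, 0.2]
jogging_sneakers = [0.8, 0.2, 0.9, 0.1]
garden_hose = [0.1, 0.9, 0.1, 0.8]

similar = cosine_similarity(running_shoes, jogging_sneakers)
different = cosine_similarity(running_shoes, garden_hose)

# Content with related meaning scores higher than unrelated content.
print(similar > different)
```

Items whose vectors point in nearly the same direction are treated as close in meaning, which is the property that search and recommendation features can build on.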

To create vector fields, head to the Dashboard, go to your dataset, and select Vectorize. Set up your vectorization and click Continue to create a token. You'll be redirected to a preconfigured Google Colab notebook, where you can paste that token to start your vectorization job.

Make sure to keep the notebook tab open until your vectorization has finished!

Read more about the different vector models and their use cases.

What to vectorize

Vectorize the fields that best represent the meaning of your content. For text, this means fields that:

  • Contain content with inherent meaning. Fields such as "first_name", "insert_date" or "location" are best handled by normal search or filtering, because you want to match them literally.
  • Contain enough content to derive meaning from. Usually this means the field contains at least one sentence. It's hard to extract useful meaning into a vector from just one or two words.
  • Don't contain so much content that the meaning is diluted. If you encode a field that contains many paragraphs, there may be no clear underlying meaning to extract, which makes the vector less useful.
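
The criteria above can be sketched as a rough pre-check before you choose fields. This is only an illustrative heuristic: the function name, the word-count thresholds, and the list of literal fields are all assumptions to tune for your own data, not part of the product.

```python
# Hypothetical thresholds; adjust them for your own dataset.
MIN_WORDS = 5      # roughly "at least one sentence"
MAX_WORDS = 300    # roughly "not many paragraphs"

# Fields better served by normal search or filtering (examples from above).
LITERAL_FIELDS = {"first_name", "insert_date", "location"}

def is_good_candidate(field_name, values):
    """Heuristically decide whether a text field is worth vectorizing."""
    if field_name in LITERAL_FIELDS:
        return False
    # Only consider non-empty string values.
    texts = [v for v in values if isinstance(v, str) and v.strip()]
    if not texts:
        return False
    avg_words = sum(len(v.split()) for v in texts) / len(texts)
    return MIN_WORDS <= avg_words <= MAX_WORDS

# Example usage with made-up sample values.
print(is_good_candidate("description",
                        ["A lightweight waterproof jacket for trail running."]))
print(is_good_candidate("first_name", ["Ada"]))
```

A word count is a crude stand-in for "enough meaning", so treat the result as a prompt to look at the field, not as a final decision.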

Once you have vector fields, you will be able to take advantage of AI search and recommendations.

Make sure to re-run vectorizations as you add new data to your dataset.