FAISS is fast. Its error messages are not. When something goes wrong you get a C++ exception bubbled up through the Python bindings — cryptic, unhelpful, and missing context. This post covers the four most common FAISS errors, what actually causes each one, and exactly how to fix them.
pip install faiss-cpu numpy
# or for GPU:
# pip install faiss-gpu numpy
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
vectors = np.random.rand(5, d).astype('float32')
index.add(vectors)
query = np.random.rand(1, d).astype('float32')
distances, indices = index.search(query, k=10)
This produces:
RuntimeError: Error in faiss/faiss/impl/index_read.cpp:...
# or silently returns -1 in the indices array for missing neighbors
You might not get a hard crash — FAISS will return -1 as a sentinel value in the indices array for slots where no neighbor exists. Downstream code that uses those indices to look up your original data will then blow up with something like:
IndexError: list index out of range
# or
KeyError: -1
You asked for k=10 nearest neighbors but the index only contains 5 vectors. FAISS cannot return more neighbors than exist, so it writes -1 into the indices output array for every slot it cannot fill. Any code that naively uses those indices, like my_documents[idx], will either crash or, for a Python list or NumPy array, silently return the last element, because -1 is a valid negative index.
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
vectors = np.random.rand(5, d).astype('float32')
index.add(vectors)
query = np.random.rand(1, d).astype('float32')
# Always cap k at the number of indexed vectors
k = min(10, index.ntotal)
distances, indices = index.search(query, k=k)
# Filter out -1 sentinels defensively even after capping
valid_mask = indices[0] != -1
valid_indices = indices[0][valid_mask]
valid_distances = distances[0][valid_mask]
print(f"Found {len(valid_indices)} neighbors: {valid_indices}")
Use index.ntotal to query how many vectors are in the index at any time. Cap k against it before every search, and filter -1 values defensively regardless — indexes can be updated concurrently in some pipelines.
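One case is worth guarding explicitly: on a Python list, my_documents[-1] does not raise, it returns the last element, so an unfiltered -1 can surface as a wrong result rather than a crash. Below is a minimal sketch of a lookup helper. The name lookup_documents and the sample arrays are hypothetical; the arrays simulate search output and are not produced by FAISS here.

```python
import numpy as np

def lookup_documents(indices_row, distances_row, documents):
    """Pair each valid neighbor id with its document, skipping -1 sentinels."""
    results = []
    for idx, dist in zip(indices_row, distances_row):
        idx = int(idx)
        # Skip the -1 sentinel and any id outside the document store
        if idx < 0 or idx >= len(documents):
            continue
        results.append((documents[idx], float(dist)))
    return results

# Simulated search output for one query with k=4 against a 2-vector index
indices = np.array([[2, 0, -1, -1]], dtype=np.int64)
distances = np.array([[0.12, 0.57, 3.4e38, 3.4e38]], dtype=np.float32)
my_documents = ["doc A", "doc B", "doc C"]

hits = lookup_documents(indices[0], distances[0], my_documents)
print([doc for doc, _ in hits])  # ['doc C', 'doc A']
```

Checking the upper bound as well catches the separate problem of an index that has drifted out of sync with the document store.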
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
vectors = np.random.rand(10, d).astype('float32')
index.add(vectors)
# Query vector has wrong dimension
query = np.random.rand(1, 64).astype('float32')
distances, indices = index.search(query, k=3)
AssertionError: d == query.shape[-1] (128 != 64)
# or in older FAISS versions:
RuntimeError: Error in faiss/faiss/impl/AuxIndexStructures.cpp:...
query.shape[1] (64) != d (128)
The index was created with dimensionality d=128 but the query vector has only 64 dimensions. FAISS computes distances between the query and the stored vectors (squared L2 for IndexFlatL2), and those are undefined when the dimensions differ. The same error appears when you index.add() a batch of vectors with the wrong shape.
Common causes:
- You switched embedding models (e.g., from all-MiniLM-L6-v2 at 384d to text-embedding-ada-002 at 1536d) but reused an old index
- The query has shape (128, 1) instead of (1, 128)
- The query has shape (128,) instead of (1, 128)
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
vectors = np.random.rand(10, d).astype('float32')
index.add(vectors)
# Wrong: 1D array
raw_query = np.random.rand(d).astype('float32')
# Fix 1: reshape 1D to 2D
query = raw_query.reshape(1, -1)
# Fix 2: assert before searching to get a clear error message
assert query.shape[1] == index.d, (
    f"Query dimension {query.shape[1]} does not match index dimension {index.d}. "
    f"Re-embed your query with the same model used to build the index."
)
distances, indices = index.search(query, k=3)
print(indices)
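The shape problems listed under the common causes can also be normalized in one place. The helper below is a sketch, with as_query_batch as a hypothetical name: it accepts input of shape (d,), (n, d), or (d, 1), converts it to a C-contiguous float32 array of shape (n, d), and raises a readable error otherwise. Treating a (d, 1) array as a single column-vector query is an assumption about intent, not something FAISS does for you.

```python
import numpy as np

def as_query_batch(x, d):
    """Return x as a C-contiguous float32 array of shape (n, d), or raise ValueError."""
    q = np.asarray(x, dtype=np.float32)
    if q.ndim == 1:
        q = q.reshape(1, -1)      # (d,) -> (1, d)
    elif q.ndim == 2 and q.shape == (d, 1) and d != 1:
        q = q.T                   # column vector (d, 1) -> (1, d)
    if q.ndim != 2 or q.shape[1] != d:
        raise ValueError(
            f"Expected queries of shape (n, {d}), got {np.shape(x)}. "
            "Re-embed with the model used to build the index."
        )
    # Transposing yields a non-contiguous view, so force a contiguous copy
    return np.ascontiguousarray(q)

d = 128
print(as_query_batch(np.random.rand(d), d).shape)     # (1, 128)
print(as_query_batch(np.random.rand(d, 1), d).shape)  # (1, 128)
```

Calling this with d=index.d right before index.search() keeps the dimension check and the reshape together.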
To diagnose silently mismatched pipelines, store the embedding model name alongside the index:
import json, faiss
def save_index(index, path, embedding_model: str):
    faiss.write_index(index, path + ".faiss")
    with open(path + ".meta.json", "w") as f:
        json.dump({"d": index.d, "embedding_model": embedding_model}, f)

def load_index(path, expected_model: str):
    with open(path + ".meta.json") as f:
        meta = json.load(f)
    if meta["embedding_model"] != expected_model:
        raise ValueError(
            f"Index was built with '{meta['embedding_model']}' "
            f"but you are using '{expected_model}'. Rebuild the index."
        )
    return faiss.read_index(path + ".faiss")
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
# Forgot to add vectors
query = np.random.rand(1, d).astype('float32')
distances, indices = index.search(query, k=5)
print(indices) # [[-1 -1 -1 -1 -1]]
print(distances) # e.g. [[3.4028235e+38 ...]] for IndexFlatL2 (float32 max as a placeholder), not real distances
There is no exception — FAISS silently returns all -1 indices. Downstream code crashes later with no obvious link to the empty index.
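Searching an empty FAISS index is not an error at the FAISS level: index.ntotal == 0, so there are simply no candidates to return. This usually happens when the step that populates the index was skipped or failed silently, or when an empty index was loaded from disk.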
import faiss
import numpy as np
d = 128
index = faiss.IndexFlatL2(d)
# Simulate population step (e.g., from a database or file)
vectors = np.random.rand(100, d).astype('float32')
index.add(vectors)
def safe_search(index, query: np.ndarray, k: int):
    """Search with guards against empty index and wrong k."""
    if index.ntotal == 0:
        raise RuntimeError(
            "FAISS index is empty. Add vectors before searching. "
            "Check your ingestion pipeline for silent failures."
        )
    if query.ndim == 1:
        query = query.reshape(1, -1)
    if query.dtype != np.float32:
        query = query.astype('float32')
    k = min(k, index.ntotal)
    distances, indices = index.search(query, k)
    # Drop -1 sentinel entries; only the first query's results are returned
    mask = indices[0] != -1
    return distances[0][mask], indices[0][mask]
query = np.random.rand(d).astype('float32')
dists, idxs = safe_search(index, query, k=5)
print(f"Top-{len(idxs)} results: {idxs}")
When loading persisted indexes from disk, always validate after loading:
import faiss
index = faiss.read_index("my_index.faiss")
if index.ntotal == 0:
    raise RuntimeError("Loaded index is empty; rebuild from source data.")
print(f"Loaded index with {index.ntotal} vectors of dimension {index.d}")
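The same check can be moved to ingestion time, so a silent failure is caught where it happens rather than at search time. The sketch below assumes only that the index object exposes d, ntotal, and add(), as FAISS indexes do; add_checked is a hypothetical helper name.

```python
import numpy as np

def add_checked(index, vectors):
    """Add vectors and verify that index.ntotal grew by the expected amount."""
    batch = np.ascontiguousarray(vectors, dtype=np.float32)
    if batch.ndim != 2 or batch.shape[1] != index.d:
        raise ValueError(
            f"Expected vectors of shape (n, {index.d}), got {batch.shape}."
        )
    if batch.shape[0] == 0:
        raise ValueError("Refusing to add an empty batch; check the upstream loader.")
    before = index.ntotal
    index.add(batch)
    added = index.ntotal - before
    if added != batch.shape[0]:
        raise RuntimeError(
            f"Expected to add {batch.shape[0]} vectors, but ntotal grew by {added}."
        )
    return added
```

Used in place of a bare index.add(vectors), this turns an empty or malformed batch into an immediate, descriptive error.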
import faiss
import numpy as np
# Attempting GPU index without checking availability
res = faiss.StandardGpuResources()
d = 128
cpu_index = faiss.IndexFlatL2(d)
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)
AttributeError: module 'faiss' has no attribute 'StandardGpuResources'
# This means you installed faiss-cpu but are using GPU API
# Or with faiss-gpu installed but compiled for a different GPU architecture or CUDA version:
RuntimeError: CUDA error: no kernel image is available for execution on the device
A subtler crash occurs when you move the index to GPU but then pass a NumPy array on CPU — FAISS handles this transparently in most cases, but certain index types require the query vectors to be explicitly placed:
RuntimeError: Error in faiss/faiss/gpu/...
Vectors must be on GPU for this index type
There are two distinct packages: faiss-cpu and faiss-gpu. They cannot coexist and GPU methods are absent from the CPU package. Even with faiss-gpu installed, CUDA version mismatches between your driver, toolkit, and the compiled FAISS binary cause runtime failures.
import faiss
import numpy as np
d = 128
def build_index(vectors: np.ndarray, use_gpu: bool = False):
    vectors = vectors.astype('float32')
    d = vectors.shape[1]
    cpu_index = faiss.IndexFlatL2(d)
    cpu_index.add(vectors)
    if not use_gpu:
        return cpu_index
    # Check GPU availability gracefully
    if not hasattr(faiss, 'StandardGpuResources'):
        print("faiss-gpu not installed, falling back to CPU.")
        return cpu_index
    ngpus = faiss.get_num_gpus()
    if ngpus == 0:
        print("No CUDA GPUs found, falling back to CPU.")
        return cpu_index
    res = faiss.StandardGpuResources()
    gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)
    print(f"Index moved to GPU 0 ({ngpus} GPU(s) available)")
    return gpu_index
# Usage
vectors = np.random.rand(1000, d).astype('float32')
index = build_index(vectors, use_gpu=True) # gracefully falls back
query = np.random.rand(1, d).astype('float32')
distances, indices = index.search(query, k=5)
print(indices)
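When it is unclear which package is actually installed, the same availability checks can be collected into a small report. This is a sketch: describe_faiss_build is a hypothetical helper, it takes the imported module as an argument, and it relies only on the attributes already used above (StandardGpuResources, get_num_gpus) plus the module's __version__.

```python
def describe_faiss_build(faiss_module):
    """Summarize whether the given faiss module exposes a usable GPU API."""
    has_gpu_api = hasattr(faiss_module, "StandardGpuResources")
    get_num_gpus = getattr(faiss_module, "get_num_gpus", None)
    # Only ask for the GPU count when the GPU API is present at all
    ngpus = get_num_gpus() if (has_gpu_api and callable(get_num_gpus)) else 0
    return {
        "version": getattr(faiss_module, "__version__", "unknown"),
        "gpu_api": has_gpu_api,
        "num_gpus": ngpus,
        "can_use_gpu": has_gpu_api and ngpus > 0,
    }
```

Calling describe_faiss_build(faiss) after import faiss and logging the result at startup makes a CPU-only install or a machine without visible GPUs obvious before any index is built.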
To move a GPU index back to CPU (e.g., for persistence — FAISS cannot save GPU indexes directly):
import faiss
# gpu_index is a GpuIndex object
cpu_index = faiss.index_gpu_to_cpu(gpu_index)
faiss.write_index(cpu_index, "index.faiss")
# On reload, move back to GPU
loaded = faiss.read_index("index.faiss")
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, loaded)
When FAISS behaves unexpectedly, run through this checklist:
import faiss
import numpy as np
def diagnose_index(index, query: np.ndarray, k: int):
    print(f"Index type: {type(index).__name__}")
    print(f"Index d: {index.d}")
    print(f"Index ntotal: {index.ntotal}")
    print(f"Query shape: {query.shape}")
    print(f"Query dtype: {query.dtype}")
    print(f"k requested: {k}")
    print(f"k effective: {min(k, index.ntotal)}")
    if index.ntotal == 0:
        print("WARNING: index is empty")
    if query.ndim == 1:
        print("WARNING: query is 1D, needs reshape(1, -1)")
    if query.dtype != np.float32:
        print(f"WARNING: query dtype is {query.dtype}, FAISS requires float32")
    if query.shape[-1] != index.d:
        print(f"ERROR: dimension mismatch (query {query.shape[-1]} vs index {index.d})")
FAISS errors almost always fall into one of four categories:
- k larger than the index: cap k with min(k, index.ntotal) and filter -1 sentinels
- Dimension mismatch: check query.shape[1] == index.d early; reshape 1D arrays; rebuild the index if you changed embedding models
- Empty index: check index.ntotal > 0 before searching; guard ingestion steps
- GPU/CPU mismatch: choose faiss-gpu vs faiss-cpu intentionally; save indexes in CPU form with index_gpu_to_cpu
All of these are easy to guard against with a thin wrapper around index.search() that validates inputs before they reach the C++ layer.
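A thin wrapper of that kind might look like the sketch below. It assumes only that the index exposes d, ntotal, and search(queries, k) returning (distances, indices), as FAISS indexes do; guarded_search is a hypothetical name. Unlike a single-query helper, it returns one filtered result per query row.

```python
import numpy as np

def guarded_search(index, queries, k):
    """Validate inputs, cap k, search, and strip -1 sentinels for every query row."""
    q = np.ascontiguousarray(queries, dtype=np.float32)
    if q.ndim == 1:
        q = q.reshape(1, -1)
    if q.ndim != 2 or q.shape[1] != index.d:
        raise ValueError(
            f"Expected queries of shape (n, {index.d}), got {q.shape}."
        )
    if k < 1:
        raise ValueError(f"k must be at least 1, got {k}.")
    if index.ntotal == 0:
        raise RuntimeError("Index is empty. Add vectors before searching.")
    k_eff = min(k, index.ntotal)
    distances, indices = index.search(q, k_eff)
    results = []
    for dist_row, idx_row in zip(distances, indices):
        keep = idx_row != -1          # drop unfilled slots
        results.append((dist_row[keep], idx_row[keep]))
    return results
```

Each element of the returned list is a (distances, indices) pair for one query, so callers can iterate over a batch without touching sentinel values.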