ColPali Log

2025-2-8

Implemented the LanceDB indexing.

Big issue with the Modal container and LanceDB:

If we do the indexing on Modal with LanceDB, the indices get created, but we then hit:

RuntimeError: lance error: LanceError(IO): Execution error: ExecNode(Take): thread panicked: task 25 panicked with message "called Result::unwrap() on an Err value: JoinError::Panic(Id(150),\"called Option::unwrap() on a None value\", ...)"

We had no choice but to file an issue: bug(python): tbl.create_index(metric="cosine") causes Rust panic in Modal container, but works locally · Issue #2105 · lancedb/lancedb

2025-2-9

Reorganized the code. It is clean now.

Re-tested the RuntimeError by copying the locally built indices into the Modal container. Still the same error.

Next, try quantization.

Insight: Indexing is crucial here. In the ColPali case, indexing does not reduce accuracy. Even a single image can be indexed effectively, since it generates 1,030 vectors, which is enough data for PQ (Product Quantization) to learn from. The more images available, the better the indexed search gets, approaching 99.999% of the accuracy of native MaxSim computation.
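For reference, a minimal NumPy sketch of the MaxSim score that the insight compares against. The 20-token query and the random data are made up for illustration; only the 1,030 × 128 page shape comes from this log, and this is not the code used in the experiments.

```python
import numpy as np


def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: for each query token take the best-matching
    document token (max dot product), then sum over the query tokens."""
    sim = query_vecs @ doc_vecs.T            # shape: (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())


rng = np.random.default_rng(0)
query = rng.standard_normal((20, 128)).astype(np.float32)    # hypothetical 20-token query
page = rng.standard_normal((1030, 128)).astype(np.float32)   # one page: 1,030 vectors of dim 128
score = maxsim(query, page)
print(f"MaxSim score: {score:.3f}")
```

An index (PQ, HNSW, ...) only approximates this exhaustive computation, which is why the log keeps comparing indexed results against the native ones.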

2025-2-10

Test indexing performance on RTX 4090

Results for query: 'What is Sushan Wild party?'

Indexing	Time	Top-1 result	Top-2 result
No	0.035	data_test_1.png, Distance: 21.97092056274414	Exploring_the_Limits_of_Language_Modeling_page_10.png, Distance: 22.033401489257812
Yes	0.019	data_test_1.png, Distance: 10.978950500488281	Generating Sequences_With_Recurrent_Neural_Networks_page_31.png, Distance: 13.490697860717773

This looks good, but the index comes out different every time it is built, and most of the time it scrambles the results. We need a solution for this.

2025-2-11

Went through the theoretical pipeline for building the best possible ColPali setup.

Make a plan:

Stage 1

Native ColPali Implementation
at 2025-2-12

Stage 2

Hybrid (HNSW + Rerank)
at 2025-2-13

Stage 3

Hybrid + Binary Quantization

2025-2-12

Native ColPali finished. Test results:
dataset: test-pdfs

Query 1: What is Sushan Wild party?
Query 2: Which party got more women in 112th?
Query 3: Who are transformers paper's authors?

Method	Query	Top-1 result	Top-2 result	Time
Native	Q1	data_test_1.png (score: 10.004)	gao-25-900570_page_74.png (score: 7.956)	0.28765s
Native	Q2	data_test_2.png (score: 17.327)	data_test_1.png (score: 14.918)	0.08996s
Native	Q3	gao-25-900570_page_24.png (score: 9.214)	gao-25-900570_page_7.png (score: 7.922)	0.09184s

2025-2-13

Hybrid (HNSW + Rerank)

Problem 1: Qdrant's upsert() hit an upload limit at 17 in our runs, so we upsert the embeddings in small batches inside a for loop.
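A minimal sketch of that batching loop. The helper names (`batched`, `upsert_in_batches`) and the `upsert_fn` callback are hypothetical; the batch size of 16 is only chosen to stay under the limit of 17 observed above, which is an empirical number from this log.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upsert_in_batches(
    points: Sequence[T],
    upsert_fn: Callable[[Sequence[T]], object],
    batch_size: int = 16,
) -> int:
    """Call upsert_fn once per batch and return the number of calls made."""
    calls = 0
    for batch in batched(points, batch_size):
        upsert_fn(batch)
        calls += 1
    return calls


# Stand-in for the real client call, so the sketch runs without a Qdrant server.
sent = []
n_calls = upsert_in_batches(list(range(40)), sent.append, batch_size=17)
print(n_calls, [len(b) for b in sent])
```

With a real client, `upsert_fn` would presumably wrap something like `client.upsert(collection_name=..., points=list(batch))`.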

Today I implemented HNSW, mean_pooling_columns, mean_pooling_rows, and prefetch.
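A sketch of what the two pooled representations could look like. It assumes the 1,030 page vectors are a 32 × 32 patch grid (1,024 vectors) followed by 6 non-patch tokens, which matches the 38 pooled vectors in the size table of the 2025-5-6 entry; the exact token layout and the function name are assumptions, not taken from the project code.

```python
import numpy as np

GRID = 32        # assumed patch grid: 32 x 32 = 1,024 patch vectors
N_SPECIAL = 6    # assumed number of trailing non-patch tokens


def pool_rows_and_cols(page_vecs: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Mean-pool the patch grid along each axis and keep the special tokens."""
    n_patches = GRID * GRID
    patches = page_vecs[:n_patches].reshape(GRID, GRID, -1)   # (row, col, dim)
    special = page_vecs[n_patches:]                           # (N_SPECIAL, dim)
    # One vector per grid row: average over the columns inside that row.
    pooled_rows = np.concatenate([patches.mean(axis=1), special])
    # One vector per grid column: average over the rows inside that column.
    pooled_cols = np.concatenate([patches.mean(axis=0), special])
    return pooled_rows, pooled_cols


rng = np.random.default_rng(0)
page = rng.standard_normal((GRID * GRID + N_SPECIAL, 128)).astype(np.float32)
rows, cols = pool_rows_and_cols(page)
print(rows.shape, cols.shape)   # 32 pooled vectors + 6 special tokens each
```

Each pooled representation shrinks a page from 1,030 vectors to 38, which is what makes it cheap enough to use as a first-stage filter.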

Some questions have been answered:

Q1: How does prefetch work in Qdrant?

search_queries.append(
    QueryRequest(
        query=q_embedding,
        prefetch=[
            Prefetch(query=q_embedding, limit=200, using="mean_pooling_columns"),
            Prefetch(query=q_embedding, limit=200, using="mean_pooling_rows")
        ],
        limit=top_k,
        with_payload=True,
        with_vector=False,
        using="original"
    )
)

This means:
Qdrant runs the prefetch sub-queries first: it collects up to 200 candidates from "mean_pooling_columns" and up to 200 from "mean_pooling_rows". It then re-scores only those candidates with the "original" vectors and returns the top top_k.

Prefetching (prefetch=[...])
The prefetch sub-queries run first, on the cheaper pooled vectors. Their results form the candidate pool; points outside that pool are never scored by the main query.
Main query (using="original")
The final ranking is computed on "original", but only over the prefetched candidates. The pooled vectors decide which points are considered, and the original vectors decide their order.
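A toy NumPy sketch of this kind of two-stage query: a cheap prefetch on pooled vectors, then exact MaxSim on the original vectors for the surviving candidates only. The function names and the tiny hand-made data are illustrative assumptions; Qdrant does all of this server-side.

```python
import numpy as np


def maxsim(q: np.ndarray, d: np.ndarray) -> float:
    """Sum over query tokens of the best dot product against document tokens."""
    return float((q @ d.T).max(axis=1).sum())


def two_stage_search(query, pooled_variants, original_pages, prefetch_limit, top_k):
    """Stage 1: collect candidates from each pooled representation.
    Stage 2: re-score only those candidates on the original vectors."""
    candidates = set()
    for pooled_pages in pooled_variants:          # e.g. [rows-pooled, cols-pooled]
        coarse = np.array([maxsim(query, p) for p in pooled_pages])
        candidates.update(np.argsort(-coarse)[:prefetch_limit].tolist())
    rescored = sorted(
        ((maxsim(query, original_pages[i]), i) for i in candidates), reverse=True
    )
    return [(i, score) for score, i in rescored[:top_k]]


query = np.array([[1.0, 0.0]])
originals = [np.array([[1.0, 0.0]]), np.array([[5.0, 0.0]]), np.array([[3.0, 0.0]])]

# Pooled vectors that agree with the originals: the best page survives the prefetch.
faithful = two_stage_search(query, [originals], originals, prefetch_limit=2, top_k=1)

# Pooled vectors that under-rate page 1: it never reaches the rerank stage.
lossy_pooled = [np.array([[1.0, 0.0]]), np.array([[0.0, 0.0]]), np.array([[3.0, 0.0]])]
lossy = two_stage_search(query, [lossy_pooled], originals, prefetch_limit=2, top_k=1)
print(faithful, lossy)
```

The second call shows the trade-off: a page the pooled vectors miss cannot be recovered by the rerank, so the prefetch limit bounds the recall of the whole pipeline.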

Q2: Why use prefetch with mean_pooling_columns and mean_pooling_rows?

How does prefetch work?

2025-2-14

81 images

Search with prefetch

pipeline.search_with_text_queries.remote(queries, prefetch_size=20, top_k=3)

Query: What is Sushan Wild party?

data_test_1.png (score: 10.005)
Unifying_Multimodal_Retrieval_via_DSE_page_9.png (score: 9.797)
Exploring_the_Limits_of_Language_Modeling_page_11.png (score: 9.752)
Query: Which party got more women in 112th?
data_test_2.png (score: 17.327)
data_test_1.png (score: 14.920)
Generating Sequences_With_Recurrent_Neural_Networks_page_29.png (score: 9.291)
Query: Who are transformers paper's authors?
attention_is_all_you_need_page_7.png (score: 12.354)
attention_is_all_you_need_page_9.png (score: 11.396)
attention_is_all_you_need_page_4.png (score: 11.261)

Search time: 0.12687s

Search without prefetch

pipeline.search_without_prefetch.remote(queries, top_k=3)

Query: What is Sushan Wild party?

data_test_1.png (score: 10.005)
Unifying_Multimodal_Retrieval_via_DSE_page_9.png (score: 9.797)
Exploring_the_Limits_of_Language_Modeling_page_11.png (score: 9.752)
Query: Which party got more women in 112th?
data_test_2.png (score: 17.327)
data_test_1.png (score: 14.920)
in_context_scheming_reasoning_paper_page_13.png (score: 9.379)
Query: Who are transformers paper's authors?
attention_is_all_you_need_page_7.png (score: 12.354)
attention_is_all_you_need_page_9.png (score: 11.396)
attention_is_all_you_need_page_4.png (score: 11.261)

HNSW

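hnsw_config=HnswConfigDiff(m=0) # HNSW switched off

m is the number of edges per node in the HNSW graph; setting m=0 disables graph construction. The companion parameter ef_construct is the number of neighbours to consider during index building: the larger the value, the more accurate the search and the longer the index takes to build.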

Binary Quantization

Query: What is Sushan Wild party?

Exploring_the_Limits_of_Language_Modeling_page_10.png (score: 1272.000)
Exploring_the_Limits_of_Language_Modeling_page_11.png (score: 1272.000)

Query: Which party got more women in 112th?

data_test_2.png (score: 1840.000)
data_test_1.png (score: 1522.000)

Query: Who are transformers paper's authors?

attention_is_all_you_need_page_7.png (score: 1396.000)
attention_is_all_you_need_page_9.png (score: 1396.000)

Search time: 0.19166s

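The results are not good enough. Scores collapse into ties (1272.000 twice, 1396.000 twice), and for Q1 data_test_1.png is no longer in the top 2. BQ reduces the accuracy of MaxSim search.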
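A toy illustration of why binarized vectors can produce tied integer scores like the ones above. The sign-based `binarize` function and the two-dimensional vectors are assumptions made for the example; Qdrant's actual BQ scoring may differ in detail.

```python
import numpy as np


def maxsim(q: np.ndarray, d: np.ndarray) -> float:
    """Sum over query tokens of the best dot product against document tokens."""
    return float((q @ d.T).max(axis=1).sum())


def binarize(vecs: np.ndarray) -> np.ndarray:
    """Keep one bit per dimension: 1.0 where the value is positive, else 0.0."""
    return (vecs > 0).astype(np.float32)


query = np.array([[1.0, 0.2]])
page_a = np.array([[0.9, 0.1]])   # points almost the same way as the query
page_b = np.array([[0.1, 0.9]])   # points in a clearly different direction

exact_a, exact_b = maxsim(query, page_a), maxsim(query, page_b)
bq_a = maxsim(binarize(query), binarize(page_a))
bq_b = maxsim(binarize(query), binarize(page_b))

print(f"exact:  a={exact_a:.2f}  b={exact_b:.2f}")   # a clearly wins
print(f"binary: a={bq_a:.0f}  b={bq_b:.0f}")         # the two pages tie
```

Once magnitudes are thrown away, pages that were well separated can get identical scores, so rescoring the BQ candidates with the original vectors is the usual way to recover the ordering.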

2025-2-25

“Full Model vs. State Dict” in PyTorch

Finished classifier training.

2025-3-2

curl -X POST "https://tu-zhenzhao--stylemi-app-v2-api-service.modal.run/search?amount=5" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@nail/183491c3-519c-446a-92d2-0e0955ff1eba.jpg"

2025-4-6

Compared the two classify_images_async variants on 1,600 images:

Embedding on the fly: 30 min
Pre-embedded: 10 s

2025-4-23


This code can handle a large image dataset. Next: try uploading 20k images.

For 1000 batch: Memory 6.97GB, Low 3.41GB

Cost: $2.03 per hour, ~4k images per hour

$5 for 8k images

2025-5-1

Iteration 1

Using the original method for reranking, with prefilter enabled for final_df (.where(f"id IN {tuple_ids}", prefilter=True))
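One thing to watch with f"id IN {tuple_ids}": a one-element Python tuple renders as (7,), and that trailing comma is typically rejected by SQL parsers. A small sketch of a safer way to build the filter string; the helper name `in_clause` is hypothetical, and it assumes the ids are integers.

```python
from typing import Iterable


def in_clause(column: str, ids: Iterable[int]) -> str:
    """Build 'column IN (a, b, c)' without relying on Python's tuple repr."""
    id_list = [int(i) for i in ids]      # assumption: ids are integers
    if not id_list:
        raise ValueError("need at least one id for an IN filter")
    return f"{column} IN ({', '.join(str(i) for i in id_list)})"


print(f"id IN {(7,)}")             # the tuple repr keeps a trailing comma
print(in_clause("id", (7,)))       # the helper does not
print(in_clause("id", [1, 2, 3]))
```

The resulting string can be passed to the same .where(..., prefilter=True) call as before.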

Results. The first block is Direct Search, the second block is Reranking Search.

Query: batman say 'Good shot robin, and now we'll see who our masked mystery man is'
Filename: 1_page_272.png
Score : 0.6503
Duration: 21.34 s
----------------------------------------
Filename: 2_page_60.png
Score : 0.6483
Duration: 21.34 s
----------------------------------------
Filename: 6_page_224.png
Score : 0.6477
Duration: 21.34 s
----------------------------------------
Filename: 1_page_268.png
Score : 0.6462
Duration: 21.34 s
----------------------------------------
Filename: 1_page_234.png
Score : 0.6431
Duration: 21.34 s
----------------------------------------

= Query: Limit of LLM =
Filename: Captain America vol 1 383 (1991) (c2ce-dcp)_page_3.png
Score : 0.8310
Duration: 20.16 s
----------------------------------------
Filename: Captain America vol 1 415 (1993) (c2ce-dcp)_page_8.png
Score : 0.8257
Duration: 20.16 s
----------------------------------------
Filename: Captain America vol 1 412 (1993) (c2ce-dcp)_page_12.png
Score : 0.8245
Duration: 20.16 s
----------------------------------------
Filename: Captain America vol 1 407 (1992) (c2ce-dcp)_page_15.png
Score : 0.8162
Duration: 20.16 s
----------------------------------------
Filename: Captain America vol 1 382 (1991) (c2ce-dcp)_page_3.png
Score : 0.8138
Duration: 20.16 s
----------------------------------------

Query: batman say 'Good shot robin, and now we'll see who our masked mystery man is'
Filename: 1_page_303.png
Score : 0.6543
Duration: 29.04 s
----------------------------------------
Filename: 4_page_186.png
Score : 0.6540
Duration: 29.04 s
----------------------------------------
Filename: 1_page_315.png
Score : 0.6514
Duration: 29.04 s
----------------------------------------
Filename: 1_page_268.png
Score : 0.6462
Duration: 29.04 s
----------------------------------------
Filename: 1_page_234.png
Score : 0.6431
Duration: 29.04 s
----------------------------------------

= Query: Limit of LLM =
Filename: Marvel Universe v1 004_page_3.png
Score : 0.9017
Duration: 31.92 s
----------------------------------------
Filename: Marvel Universe v1 005_page_4.png
Score : 0.8899
Duration: 31.92 s
----------------------------------------
Filename: Marvel Universe v1 001_page_3.png
Score : 0.8888
Duration: 31.92 s
----------------------------------------
Filename: 352.08 Spider-Man V1 #19 (Digital)_page_14.png
Score : 0.8878
Duration: 31.92 s
----------------------------------------
Filename: Captain America vol 1 263 (1981) (c2ce) (Mazen-DCP)_page_22.png
Score : 0.8454
Duration: 31.92 s
----------------------------------------

Iteration 2

Now indexed pooling_rows and pooling_cols, but no scalar index and no index on original.

Direct Search: unchanged from Iteration 1 (same results; 21.34 s and 20.16 s).

Reranking Search

= Query: batman say 'Good shot robin, and now we'll see who our masked mystery man is' =
Filename: 3_page_235.png
Score : 0.6637
Duration: 25.72 s
----------------------------------------
Filename: 3_page_250.png
Score : 0.6624
Duration: 25.72 s
----------------------------------------
Filename: 3_page_20.png
Score : 0.6591
Duration: 25.72 s
----------------------------------------
Filename: 3_page_234.png
Score : 0.6581
Duration: 25.72 s
----------------------------------------
Filename: 1_page_234.png
Score : 0.6431
Duration: 25.72 s
----------------------------------------

= Query: Limit of LLM =
Filename: Captain America vol 1 283 (c2ce-dcp)_page_35.png
Score : 0.9489
Duration: 22.05 s
----------------------------------------
Filename: Marvel Universe v1 002_page_2.png
Score : 0.9303
Duration: 22.05 s
----------------------------------------
Filename: Marvel Universe v1 002_page_29.png
Score : 0.9256
Duration: 22.05 s
----------------------------------------
Filename: Captain America vol 1 284 (c2ce-dcp)_page_31.png
Score : 0.9251
Duration: 22.05 s
----------------------------------------
Filename: Marvel Universe v1 004_page_3.png
Score : 0.9017
Duration: 22.05 s

Iteration 3

Now indexed pooling_rows, pooling_cols, and the scalar column, but no index on original.

Direct Search: unchanged from Iteration 1 (same results; 21.34 s and 20.16 s).

Reranking Search

= Query: batman say 'Good shot robin, and now we'll see who our masked mystery man is' =
Filename: 3_page_235.png
Score : 0.6637
Duration: 1.60 s
----------------------------------------
Filename: 3_page_250.png
Score : 0.6624
Duration: 1.60 s
----------------------------------------
Filename: 3_page_20.png
Score : 0.6591
Duration: 1.60 s
----------------------------------------
Filename: 3_page_234.png
Score : 0.6581
Duration: 1.60 s
----------------------------------------
Filename: 1_page_234.png
Score : 0.6431
Duration: 1.60 s
----------------------------------------

= Query: Limit of LLM =
Filename: Captain America vol 1 283 (c2ce-dcp)_page_35.png
Score : 0.9489
Duration: 2.72 s
----------------------------------------
Filename: Marvel Universe v1 002_page_2.png
Score : 0.9303
Duration: 2.72 s
----------------------------------------
Filename: Marvel Universe v1 002_page_29.png
Score : 0.9256
Duration: 2.72 s
----------------------------------------
Filename: Captain America vol 1 284 (c2ce-dcp)_page_31.png
Score : 0.9251
Duration: 2.72 s
----------------------------------------
Filename: Marvel Universe v1 004_page_3.png
Score : 0.9017
Duration: 2.72 s

Iteration 4

Now indexed pooling_rows, pooling_cols, the scalar column, and original.

original indexing time

Reranking Search: unchanged from Iteration 3 (same results; 1.60 s and 2.72 s).

Direct Search

= Query: batman say 'Good shot robin, and now we'll see who our masked mystery man is' =
Filename: 5_page_156.png
Score : 0.6910
Duration: 0.87 s
----------------------------------------
Filename: 3_page_173.png
Score : 0.6894
Duration: 0.87 s
----------------------------------------
Filename: 2_page_159.png
Score : 0.6868
Duration: 0.87 s
----------------------------------------
Filename: 2_page_385.png
Score : 0.6832
Duration: 0.87 s
----------------------------------------
Filename: Captain America vol 1 400 (1992) (c2ce-dcp)_page_30.png
Score : 0.6819
Duration: 0.87 s
----------------------------------------

= Query: Limit of LLM =
Filename: Captain America vol 1 382 (1991) (c2ce-dcp)_page_3.png
Score : 0.9471
Duration: 0.36 s
----------------------------------------
Filename: Captain America vol 1 406 (1992) (c2ce-dcp)_page_24.png
Score : 0.9397
Duration: 0.36 s
----------------------------------------
Filename: Captain America vol 1 407 (1992) (c2ce-dcp)_page_15.png
Score : 0.9275
Duration: 0.36 s
----------------------------------------
Filename: Daredevil 157 (03-1979)(HD)(C2C)(RexTyler-DCP)_page_11.png
Score : 0.9242
Duration: 0.36 s
----------------------------------------
Filename: 06. Thor 373_page_12.png
Score : 0.9238
Duration: 0.36 s
----------------------------------------

2025-5-6

Key facts about our table

25 000 rows – each is a ColPali page.
multivector columns (pooled_rows, pooled_cols, original): original holds ≈1 030 token-vectors per row, the pooled columns hold 38 each.
- Only IVF-PQ + cosine is available for multivectors in LanceDB Cloud.
id is high-cardinality (mostly unique). BTREE is the recommended scalar index.
All calls are remote (RemoteTable) → every search is a separate HTTPS round-trip.

Baseline:

no indices

stage	server work	latency
pooled_cols search	flat scan (25 k × 1 030 dots)	≈ 10 s
pooled_rows search	flat scan again	≈ 10 s
original refine	flat scan, then discard rows not in id IN (…) (post-filter, because no scalar idx)	≈ 10 s
total	3 scans × 25 k rows	32 – 35 s

Plan shows full_scan: true, prefilter: false.

Add IVF-PQ on the two pooled columns

What changed – the first two scans become ANN look-ups; each touches only a few hundred centroids + PQ codes.
What did not change –
- The IN (…) filter still has to walk the whole table (no scalar index).
- Distance for the refine step still computed on all 25 k rows because the pre-filter has no rapid way to eliminate them.

stage	latency now
pooled_cols (IVF-PQ, cosine)	≈ 40 ms
pooled_rows (IVF-PQ, cosine)	≈ 40 ms
original refine (flat scan)	≈ 22–25 s
total	22 – 26 s (my log)

The 9–10 s gain we observed comes from the two full scans we removed.

Add BTREE scalar index on id

The WHERE id IN (…) PREFILTER now executes via the BTREE in O(|candidates| log N) instead of a table walk.
Pre-filter runs before distance computation, so only the ≈200 candidate rows (the union of the two pooled searches) survive to the refine step.
- Distance ops: 200 rows × 1 030 ≈ 206 k dot-products → a few ms.
Network overhead is still 3 RTTs, but each request body is only a few kB.
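The dot-product arithmetic above, spelled out as a quick check. It counts per query token, as the log does; the constant and function names are made up for the sketch.

```python
TOKENS_PER_PAGE = 1_030    # token vectors per row in the original column
TOTAL_ROWS = 25_000        # rows in the table
CANDIDATE_ROWS = 200       # rows left after the BTREE prefilter


def dot_products(rows: int, tokens_per_row: int = TOKENS_PER_PAGE) -> int:
    """Dot products needed to score `rows` pages against one query token."""
    return rows * tokens_per_row


full_scan = dot_products(TOTAL_ROWS)       # refine step without the scalar index
refined = dot_products(CANDIDATE_ROWS)     # refine step with the prefilter
reduction = full_scan / refined

print(f"{full_scan:,} vs {refined:,} dot products ({reduction:.0f}x fewer)")
```

The refine step does roughly two orders of magnitude less distance work once the prefilter is index-backed, which is in line with the latency drop recorded in the tables.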

stage	latency now
pooled_cols IVF-PQ	40 ms
pooled_rows IVF-PQ	40 ms
original refine on 200 rows	~ 500 ms
total (including 3× HTTPS)	1.6 – 2.7 s

Query-to-query variation (1.6 vs 2.7 s) is mostly:

size of the IN (…) list (Batman query produced ≈180 IDs, LLM query ≈310),
job queuing on the shared Cloud worker,
network round-trip jitter.

Execution plan now shows

VectorSearch
  index: IVF_PQ(cosine)          -- for pooled vectors
  prefilter: true
  row_count: 197                 -- refine touches only ~200 rows
full_scan: false                 -- ✅ no table-wide scan

Why we didn’t index the “original” column

At 25 k pages it was already < 1 s after rerank; indexing “original” would shave only a few hundred ms.
Keeping it flat lets us A/B the quality difference between full token-set vs. pooled ANN embeddings.

If we need even lower latency later (or many more pages), we can build an HNSW on a single-vector surrogate (e.g., CLS-token embedding) or switch prefetch_limit down to 50.

Take-away table

configuration	pooled vectors index	scalar id index	refine rows	total latency
None	✗	✗	25 000	32–35 s
IVF-PQ on pooled	✓	✗	25 000	22–26 s
IVF-PQ + BTREE (our final)	✓	✓	~200	1.6–2.7 s

Size of Multi-vector

	Original	Pooled
float32 numbers per image	1030 vectors × 128 dimensions = 131,840	38 vectors × 128 dimensions = 4,864
Size per image	~527.36 KB	~19 KB
Size for 25,000 images	13,184,000,000 bytes	486,400,000 bytes
Vectors for 25,000 images	1030 × 25,000 = 25,750,000 (≈25M)	38 × 25,000 = 950,000 (< 1M)
Total	12.28 GiB	0.45 GiB
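A short script to re-derive the numbers in this table. It assumes float32 storage (4 bytes per number) and 128-dimensional vectors, as stated above; the function name is made up for the sketch.

```python
BYTES_PER_FLOAT32 = 4


def footprint(vectors_per_image: int, dim: int = 128, images: int = 25_000) -> dict:
    """Storage needed for one multivector column, ignoring index and metadata overhead."""
    floats_per_image = vectors_per_image * dim
    bytes_per_image = floats_per_image * BYTES_PER_FLOAT32
    total_bytes = bytes_per_image * images
    return {
        "floats_per_image": floats_per_image,
        "bytes_per_image": bytes_per_image,
        "total_vectors": vectors_per_image * images,
        "total_bytes": total_bytes,
        "total_gib": total_bytes / 2**30,
    }


original = footprint(1030)   # full token set per page
pooled = footprint(38)       # 32 pooled vectors + 6 special tokens per page

for name, stats in (("original", original), ("pooled", pooled)):
    print(f"{name}: {stats['bytes_per_image']:,} B/image, "
          f"{stats['total_vectors']:,} vectors, {stats['total_gib']:.2f} GiB")
```

The pooled columns come out roughly 27 times smaller (1030 / 38), which is why they are the ones worth indexing first.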

Eg:

Reranker. rerank number: 200

2025-5-26

Got an architecture idea: Multimodal Retrieve Architecture

GME-Qwen2VL very good in layout and image recognition
BLIP-2 does not work
Looking for an open-source OCR that scores around 80: Open Source OCRs

‼️GME-Qwen2-VL model only works on transformers==4.50.0

Solution:
change the gme_inference.py

- # old, breaks on ≥4.51
- inputs_embeds = self.base.model.embed_tokens(input_ids)
+ # new, uses the public accessor instead of the internal module path
+ inputs_embeds = self.base.get_input_embeddings()(input_ids)

2025-5-27

🛠️ Development Log – Multi-Module Retrieve Architecture (MRA)

Focus: Stability testing of architecture, integration of ColPali, and agent initialization.

✅ 1. Multi-Module Retrieve Architecture (MRA) – Initial Test

Goal: Establish a stable architecture without implementing actual models or agents.
Action Taken:
- Implemented a minimal test version using hardcoded values.
- Skipped real model/agent execution to focus on framework stability.
Result:
- Architecture passed the test with stable performance.
- Demonstrates that MRA can support modularity without failure.

⚙️ 2. ColPali Integration + Agent Initialization

ColPali Status:
- Successfully integrated the ColPali model into the architecture.
- It behaves as expected.
- ✅ Only 2 files modified out of ~20 total files → confirms the architecture’s modularity and decoupling goal.
Agent Status:
- Initialized a basic agent using Gemini.
- Current agent behavior is basic and “stupid” due to an underdeveloped prompt.
- Next Step: Improve the prompt and agent logic.

🔧 3. System Functionality & Notes

Voting System: Working correctly.
Ingestion: Working perfectly.
Parameterization: Most values are hardcoded for now, which is acceptable for this stage.
SSH Reminder:
- Since the system is hosted on a remote server, remember to:

# SSH port forward: local 8000 → remote 8000
ssh -L 8000:g3071:8000 [email protected]

This ensures local access to the remote app.

🚧 4. Known Limitations

Agent cannot read images yet.
This will be the top priority for tomorrow.
- Implement image-reading capabilities for the agent.

🎯 Summary

✅ Architecture is stable and modular.
✅ ColPali integration test passed with minimal change.
✅ Voting and ingestion systems are in place.
❗Agent is initialized but needs smarter logic and image processing.
📝 Note SSH setup for remote development.