Irish-Left-Archive/ILAv2

Text content and OCR #18

Closed

opened 2025-03-04 10:33:41 +00:00 by aonrud · 1 comment

aonrud commented

2025-03-04 10:33:41 +00:00

Owner

V2 should have some capacity to search the document contents.

Possible solution: on save, use pymupdf to extract the text content of each PDF page (falling back to OCR if the page has no text) and save it to an additional ItemPageText model.
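
A minimal sketch of the per-page extraction with OCR fallback described above, not the project's actual implementation. The `PageText` dataclass is a hypothetical stand-in for the proposed ItemPageText model (its fields are assumptions), and `extract_page_texts` and `extract_pdf` are illustrative names. The pymupdf calls assume a recent PyMuPDF release that can be imported as `pymupdf`, and `get_textpage_ocr` additionally requires a Tesseract installation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class PageText:
    """Hypothetical stand-in for the proposed ItemPageText model."""
    page_number: int  # 1-based page number within the PDF
    text: str         # extracted or OCR'd text for the page
    from_ocr: bool    # True if the text came from the OCR fallback


def extract_page_texts(
    pages: Iterable[Any],
    get_text: Callable[[Any], str],
    ocr_text: Callable[[Any], str],
) -> List[PageText]:
    """Return one PageText per page, using OCR only when no text is embedded."""
    results = []
    for number, page in enumerate(pages, start=1):
        text = get_text(page).strip()
        from_ocr = False
        if not text:
            # Page has no embedded text layer (e.g. a scan): fall back to OCR.
            text = ocr_text(page).strip()
            from_ocr = True
        results.append(PageText(number, text, from_ocr))
    return results


def extract_pdf(path: str) -> List[PageText]:
    """Wire the generic logic above to pymupdf (imported lazily)."""
    import pymupdf  # assumption: PyMuPDF is installed under this name

    def embedded(page):
        return page.get_text()

    def ocr(page):
        # Assumption: Tesseract is available; language and dpi are example values.
        textpage = page.get_textpage_ocr(language="eng", dpi=300, full=True)
        return page.get_text(textpage=textpage)

    with pymupdf.open(path) as doc:
        return extract_page_texts(doc, embedded, ocr)
```

Keeping the fallback logic separate from pymupdf makes it testable without PDF files or Tesseract. The save hook would then call something like `extract_pdf(path)` and store each result as one ItemPageText row.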

aonrud added a new dependency

2025-07-19 08:21:57 +00:00

#21 OCR Quality

aonrud commented

2025-07-19 08:23:03 +00:00

Author

Owner

Closing in favour of #21. The search part should be handled as a separate issue.

aonrud removed a dependency

2025-07-19 08:23:12 +00:00

#21 OCR Quality

aonrud closed this issue

2025-07-19 08:23:20 +00:00
