How to perform indexing at will

#1

Hi,
For our internal use, we are building a solution based on the API, mostly the search API.
We are facing a case where a user uploads files to a given folder, then our BPM launches an API search on that folder to get the assets matching custom fields.
The problem is that, between the user’s upload and the API search, the search returns an empty result most of the time because indexing isn’t done yet. From what I see in the log, indexing is run every minute by cron.
Is there a way to perform indexing at will?

#2

Hi scivray,

To perform indexing, go to Administration > Indexing and click "Yes, re-index all files".

Then check the search again.

Hope this helps.

#3

Thank you, but that’s just another way to perform the API call searchIndex.
In the log it does the same thing:

{ts ‘2016-12-02 10:04:02’} ---------------------- Removing Collection for rebuild
{ts ‘2016-12-02 10:04:02’} ---------------------- Fetching remote secret key
{ts ‘2016-12-02 10:04:02’} ---------------------- Found key : 452620F6703D42FBAD27AAE57DC8AA54
{ts ‘2016-12-02 10:04:02’} ---------------------- Collection removed for rebuild

But then the API search finds no assets because the indexing has not been performed yet.
Example:

{ts ‘2016-12-02 10:04:22’} ---------------------- Starting Search
{ts ‘2016-12-02 10:04:22’} ---------------------- Fetching remote secret key
{ts ‘2016-12-02 10:04:22’} ---------------------- Found key : 452620F6703D42FBAD27AAE57DC8AA54
{ts ‘2016-12-02 10:04:22’} ---------------------- SEARCH STARTING !!!
SEARCH WITH: ( extension:(xlsx) AND customfieldvalue:(C94ADDA87A00453E9CEAA721127FB18ET) ) AND ( folder:("65F3E8501ACC48F8BBECBEBAEB28ED90") )
{ts ‘2016-12-02 10:04:23’} ---------------------- START Error on search
+--------------+--
| template | /opt/razuna_tomcat_1_9/tomcat/webapps/razuna-searchserver/api/search.cfc
| tagcontext | {CFML Type::array}
| type | Expression
| detail |
| message | numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
| errorcode |
| extendedinfo |
| errnumber | com.bluedragon.search.search.SearchFunction
+--------------+--
{ts ‘2016-12-02 10:04:23’} ---------------------- END Error on search

Then, finally, the cron job starts indexing:

{ts ‘2016-12-02 10:04:54’} — Executing Indexing from Cron
{ts ‘2016-12-02 10:04:54’} ---------------------- Starting indexing
{ts ‘2016-12-02 10:04:54’} ---------------------- Grabing hosts and files for indexing
{ts ‘2016-12-02 10:04:54’} ---------------------- Found 11 records to index
{ts ‘2016-12-02 10:04:54’} ---------------------- Checking the lock file for Collection: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Lock file created for: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Found 11 consolidated records to index.
{ts ‘2016-12-02 10:04:54’} ---------------------- Starting to index file: 185BD35B585B4FFA8DCAA300B4C12F0C (doc) for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Getting Document: 185BD35B585B4FFA8DCAA300B4C12F0C for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Getting FolderPath: 65F3E8501ACC48F8BBECBEBAEB28ED90 for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Getting Custom Fields: 185BD35B585B4FFA8DCAA300B4C12F0C (files) for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Getting Labels: 185BD35B585B4FFA8DCAA300B4C12F0C (files) for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Getting FILE Document: 185BD35B585B4FFA8DCAA300B4C12F0C for host: 1
{ts ‘2016-12-02 10:04:54’} ---------------------- Added file 185BD35B585B4FFA8DCAA300B4C12F0C (doc) for host: 1 to QoQ

After this indexing run, the API search works like a charm and finds the assets.

On my server, it seems the “real” indexing is performed at 54 seconds past every minute. It’s this “real” indexing that I am looking to perform at will.

#4

Hi scivray,

This indexing interval was set to its default value by Nitai. I will check with him to see whether it is possible to change it.

Thanks.

#5

Ok, thank you.

#6

Hello,

A cron job cannot be run more often than once per minute. However, that would not be the solution for you anyhow. The issue you are having is that you are firing off your API search query too soon, because Razuna hasn’t finished processing the uploaded file(s) yet.

You see, when you upload a file to Razuna, much more happens in the background than simply storing the file. Razuna processes all metadata, processes custom fields (if enabled), creates different renditions (again reading and writing metadata) and, if an external storage location is being used, moves the file there. Only once it has successfully moved the file(s) to the final storage location does it set a flag to start indexing.

As many other customers have done before, what you need to do is delay your API call or query the API to see whether the record has been indexed already, i.e. build a queue system and fire off API calls once the file is indexed.
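
To illustrate the "delay or poll" idea, here is a minimal Python sketch of a polling helper. It is not part of Razuna: the name wait_until_indexed and its parameters are hypothetical. It also assumes your client wraps the search API call in a function that returns a list of assets, which is empty while the files are not yet indexed. The timeout and interval values are placeholders to tune for your setup.

```python
import time
from typing import Callable, Optional, Sequence


def wait_until_indexed(
    search: Callable[[], Sequence],
    min_results: int = 1,
    timeout: float = 180.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Sequence]:
    """Poll search() until it returns at least min_results items.

    search is a hypothetical wrapper around your own API search call;
    it is assumed to return an empty sequence while nothing is indexed.
    Returns the results, or None if the timeout would be exceeded.
    """
    deadline = clock() + timeout
    while True:
        results = search()
        if len(results) >= min_results:
            return results
        # Give up if another wait would take us past the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)


# Example use (search_folder is a placeholder for your own API call):
#
#   assets = wait_until_indexed(lambda: search_folder(folder_id), min_results=11)
#   if assets is None:
#       ...  # indexing did not finish in time; retry later or alert
```

Since indexing runs from cron once per minute and only after processing has finished, a timeout of a few minutes is probably a sensible starting point. If your BPM already has a queue, the same check can be run as a retried job instead of a blocking loop.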

Hope this helps.

Cheers,
Nitai
Founder

#7

Ok, thank you for this detailed answer.