Unable to create full text index using Whoosh

Hello

I run the command rhodecode-index --instance-name=community-1 to create the index, but "Kill" is displayed at the end and the index is never completely created.

What additional information do I need to provide?

Any ideas what the problem might be?

SYSTEM INFO

CentOS Linux release 8.0.1905 (Core)

RHODECODE CONTROL VERSION: 1.24.4

  • NAME : community-1

  • STATUS : RUNNING
    logs: /root/.rccontrol/community-1/community.log

  • VERSION : 4.27.1 Community

  • VCS : vcsserver-1

  • URL : http://0.0.0.0:10020

  • CONFIG: /root/.rccontrol/community-1/rhodecode.ini

  • NAME : vcsserver-1

  • STATUS : RUNNING
    log : /root/.rccontrol/vcsserver-1/vcsserver.log

  • VERSION : 4.27.1 VCSServer

  • URL : http://127.0.0.1:10010

  • CONFIG : /root/.rccontrol/vcsserver-1/vcsserver.ini

Most likely this is due to not enough RAM being available to build the full-text search index.
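
One way to check this, offered as a rough sketch rather than an official procedure: when the Linux OOM killer ends a process, the kernel log contains a "Killed process <pid> (<name>)" line. The helper below assumes Python 3; find_oom_kills is a name made up for this post, not part of RhodeCode, and the text you pass in would come from something like the output of dmesg (which may need root).

```python
import re

# Matches the kernel log message written when the OOM killer ends a process,
# e.g. "Out of memory: Killed process 4321 (python) total-vm:...".
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log):
    """Return a (pid, process_name) pair for every OOM kill in the log text."""
    return [(int(m.group(1)), m.group(2)) for m in OOM_KILL.finditer(kernel_log)]
```

If the list is empty after a failed run, the process was probably stopped by something other than the OOM killer.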

Hello RhodeCode Team

My server RAM is 8GB.
How much RAM is recommended to add?

It depends on the size of the repositories. Maybe try doubling it for indexing?

Hello RhodeCode Team

I increased the RAM to 16GB, but the process is still killed halfway through execution.

ERROR

Is it because the amount of data is too large?
Or can the index be created in segments?
For example: index svn revisions r01~r1000 this time and r1001~r2000 next time.

Hello RhodeCode Team

My search_mapping.ini settings

commin_process_limit = 200
repo_limit = 1

The command I ran:
rhodecode-index --instance-name=community-1 --mapping=path/search_mapping.ini

But it doesn't work. Is there an error in the command?

Hi,

Looks like there's a typo in the config. Please make sure it has these keys and this structure:

https://code.rhodecode.com/rhodecode-tools-ce/files/default/rhodecode_tools/commands/configs/mapping.ini?at=default
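
A quick way to spot a misspelled key is to compare the keys in your file against the names you expect. This is only a sketch, assuming Python 3: the three expected names are the ones mentioned in this thread (the real mapping.ini has more), and find_suspect_keys is a hypothetical helper, not something shipped with rhodecode-tools. It reports only keys that look like near-misses of an expected name and ignores unrelated keys.

```python
import configparser
import difflib

# Key names taken from this thread; the reference mapping.ini defines more.
EXPECTED_KEYS = ["commit_fetch_limit", "commit_process_limit", "repo_limit"]


def find_suspect_keys(ini_text, expected=EXPECTED_KEYS):
    """Return (section, key, suggestion) for keys that look like typos."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    suspects = []
    for section in parser.sections():
        for key in parser[section]:
            if key in expected:
                continue  # spelled correctly
            # Only flag keys that are very close to an expected name.
            close = difflib.get_close_matches(key, expected, n=1, cutoff=0.8)
            if close:
                suspects.append((section, key, close[0]))
    return suspects
```

Run against the text of search_mapping.ini, an empty result means none of the keys looks like a misspelling of those three names; it does not prove that the file is being loaded.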

Hello RhodeCode Team

I copied rhodecode_tools/commands/configs/mapping.ini from rhodecode-tools-ce into my search_mapping.ini on Linux.

Set commit_fetch_limit = 100
commit_process_limit = 100
repo_limit = 1

But rhodecode-index still does not pick up the settings from search_mapping.ini.

The commit info still shows process_limit = 10000 and repo process limit: -1

Hello RhodeCode Team

Supplement: Attached pictures

Hello RhodeCode Team

Is this related to the Python version?
My Python is 2.7.15.

Is there a recommended Python version?

The Python version is OK. We're not sure what the cause is, but you can also specify the settings on the command line when you run it; please see --help for the invocation options.

Hello RhodeCode Team

I used the parameters --commit_fetch_limit=100 --commit_process_limit=100 --repo_limit=1, and rhodecode-index picked up the settings, but the scan still does not release memory.

For example, with commit_process_limit=100, memory is not released after revisions 001~100 have been processed; usage keeps accumulating.

I’m truly sorry, but I need to trouble you once more.

Hello,

Same thing here: I have repos with big files. Even if I skip those with skip_files= and skip_files_content=, rhodecode-index is still killed prematurely.
Does skip_files really prevent RAM usage? Or is there a way to keep the process from being killed even when it runs out of RAM (by using swap, for example)?

I personally gave up on indexing file contents (commits only) because of the low amount of RAM on my server. I did this with a command switch instead of the mapping file:

rhodecode-index --index-types=commits

Source: Force WHOOSH to re-index through rcstack - #12 by justinmassiot
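
If you script the indexing run, the exit status shows whether the process finished or was killed. A minimal sketch, assuming Python 3 on Linux; describe_exit and run_index are made-up helper names, and the command list is whatever rhodecode-index invocation you already use:

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short status string."""
    if returncode == 0:
        return "finished normally"
    if returncode < 0:
        # On POSIX, a negative code means the child was ended by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_index(command):
    """Run the given command (a list of arguments) and describe how it ended."""
    return describe_exit(subprocess.run(command).returncode)
```

For example, run_index(["rhodecode-index", "--index-types=commits"]) returning "killed by SIGKILL" would be consistent with the out-of-memory explanation earlier in this thread, although SIGKILL can come from other sources too.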