Search Indexation
Overview
A key feature of the Textkernel Connector is the search indexation engine. This infrastructure copies relevant Salesforce Candidate and Vacancy record data into the Textkernel Search index. This activity is also known as indexation of the records. Once the data is in Textkernel's index it can be accessed using Textkernel's semantic search and match features. The integration provides several features:
- flexible mapping of the Salesforce object fields to the Textkernel index data model;
- importing all candidate and vacancy records, or a subset of them;
- monitoring the indexation status of these records;
- automatically retrying the indexation if an error occurs;
- performing partial record updates instead of full updates when a full re-parsing of attached binary documents (e.g. CV/Resume) is not necessary;
- skipping indexation requests if no mapped data fields have changed since the last successful indexation event.
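The last feature in the list can be pictured as a checksum comparison. The following is a minimal sketch, not the connector's implementation: the Textkernel Indexing Status record does store a checksum of the last successful request XML (see the field list further down), but the hash algorithm is not documented here, so SHA-256 and the function names `request_checksum` and `should_skip` are assumptions made for illustration.

```python
import hashlib
from typing import Optional


def request_checksum(request_xml: str) -> str:
    """Return a hex digest of the request body (SHA-256 is an assumption)."""
    return hashlib.sha256(request_xml.encode("utf-8")).hexdigest()


def should_skip(request_xml: str, last_successful_checksum: Optional[str]) -> bool:
    """Skip when the new request is identical to the last successful one."""
    if last_successful_checksum is None:
        # No successful indexation recorded yet, so the request must be sent.
        return False
    return request_checksum(request_xml) == last_successful_checksum


previous = request_checksum("<candidate><name>Ada</name></candidate>")
print(should_skip("<candidate><name>Ada</name></candidate>", previous))
print(should_skip("<candidate><name>Grace</name></candidate>", previous))
```

Because only mapped fields end up in the request XML, a change to an unmapped Salesforce field produces the same checksum and the request is skipped.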
There are two indexation engines available in the integration:
1. The standard push indexation architecture, which pushes changes from Salesforce into Textkernel.
2. The optional high-throughput indexation (HTI) add-on, a pull architecture, which pulls data from Salesforce into Textkernel.
Textkernel Indexing Status tracking
For each Candidate or Vacancy record that must be indexed in Textkernel Search, the integration creates a related-object record called Textkernel Indexing Status (the TIS record). The TIS tracks the indexation status: whether the record in Textkernel is awaiting an update, whether it was indexed successfully in Textkernel Search, and when it was last indexed. If the record was not indexed successfully, the TIS also captures a detailed error message.
Starting with release 5.6.4, the indexation status of a record can be one of the following:
- Up to date: indicates that the record has been properly indexed in Textkernel Search and the data is up to date;
- To be indexed: for every new candidate/vacancy record created in Salesforce or existing record that has been updated (and that meets the indexation filter - see below), the status of the related Textkernel Indexing Status record is set to To be indexed. Records with this status are queued and indexed again in Textkernel Search;
- Submitted for indexation (Push): the mechanism that pushes indexation requests from Salesforce to Textkernel is working on the record and it is being actively indexed;
- Submitted for indexation (Pull): the HTI pull indexation feature is working on the record and it is being actively indexed;
- To be removed: records marked with this status are queued to be removed from Textkernel Search;
- Submitted for removal (Push): the mechanism that pushes removal requests from Salesforce to Textkernel is working on the record and it is being actively removed from the index;
- Submitted for removal (Pull): the HTI pull indexation feature is working on the record and it is being actively removed from the index;
- Removed: candidate/vacancy records whose related Textkernel Indexing Status is set to Removed have been deleted from the Textkernel Search index.
At a regular interval, the integration runs and updates all candidate/vacancy records that are marked with the status To be indexed. These records are queued and updated in Textkernel Search, then the status of each record is set to Up to date if the indexation completed successfully.
This background job makes sure that Candidate and Vacancy records in Textkernel Search are always indexed (or removed) within a few minutes of their change in Salesforce, ensuring that users always have access to the most up-to-date data.
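As a reading aid, the status values and the successful-path transitions described above can be written down as a small lookup table. This is a sketch inferred from the status descriptions only: the names `Status`, `HAPPY_PATH` and `can_transition` are hypothetical, error handling and the "request when busy" cases are left out, and what happens to a record after Removed is not modelled because this section does not spell it out.

```python
from enum import Enum


class Status(Enum):
    UP_TO_DATE = "Up to date"
    TO_BE_INDEXED = "To be indexed"
    SUBMITTED_FOR_INDEXATION_PUSH = "Submitted for indexation (Push)"
    SUBMITTED_FOR_INDEXATION_PULL = "Submitted for indexation (Pull)"
    TO_BE_REMOVED = "To be removed"
    SUBMITTED_FOR_REMOVAL_PUSH = "Submitted for removal (Push)"
    SUBMITTED_FOR_REMOVAL_PULL = "Submitted for removal (Pull)"
    REMOVED = "Removed"


# Successful-path transitions only (inferred, not an official state chart).
HAPPY_PATH = {
    Status.UP_TO_DATE: {Status.TO_BE_INDEXED, Status.TO_BE_REMOVED},
    Status.TO_BE_INDEXED: {
        Status.SUBMITTED_FOR_INDEXATION_PUSH,
        Status.SUBMITTED_FOR_INDEXATION_PULL,
    },
    Status.SUBMITTED_FOR_INDEXATION_PUSH: {Status.UP_TO_DATE},
    Status.SUBMITTED_FOR_INDEXATION_PULL: {Status.UP_TO_DATE},
    Status.TO_BE_REMOVED: {
        Status.SUBMITTED_FOR_REMOVAL_PUSH,
        Status.SUBMITTED_FOR_REMOVAL_PULL,
    },
    Status.SUBMITTED_FOR_REMOVAL_PUSH: {Status.REMOVED},
    Status.SUBMITTED_FOR_REMOVAL_PULL: {Status.REMOVED},
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if target directly follows current on the successful path."""
    return target in HAPPY_PATH.get(current, set())


print(can_transition(Status.TO_BE_INDEXED, Status.SUBMITTED_FOR_INDEXATION_PULL))
print(can_transition(Status.TO_BE_INDEXED, Status.UP_TO_DATE))
```

The second call returns False because a record always passes through one of the Submitted statuses before it becomes Up to date.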
This diagram shows how the Indexing Status values change:
Textkernel Indexing Status record fields
The Textkernel Indexing Status record (TIS) is used to keep track of the execution of the indexation job. There is no need for users or admins to modify these values directly. The integration will automatically manage them. The TIS uses the following fields:
- Candidate: lookup field that contains a reference to the related Candidate record (in the case the Textkernel Indexing Status record refers to a candidate);
- Vacancy: lookup field that contains a reference to the related Vacancy record (in the case the Textkernel Indexing Status record refers to a vacancy);
- Status: the indexation status of the candidate/vacancy record (see the list of possible status values above);
- Last indexing request: timestamp of the last attempt to index the related candidate/vacancy record;
- Last successful indexing: timestamp of the last successful indexation of the candidate/vacancy record;
- Priority: the indexation priority of this candidate/vacancy record (see the Priority paragraph below);
- Response code and message: in case of errors, these fields contain useful information that can help the Textkernel Support team to identify what the issue is; for candidate/vacancy records successfully indexed, the response code field is set to 200;
- Error severity code: in case an error occurred while indexing, this field describes the severity of the error, where 1 means a temporary error, 2 a configuration error and 3 a permanent error (see below for more information about the errors and the indexing retry mechanism);
- Next retry: if an error occurred while indexing the related record, this field shows the timestamp of the next indexing retry;
- Retry number: if an error occurred while indexing the related record, this field contains the number of times the connector has tried to index the record.
- Full update: used by the partial updates feature. If True, indicates that a full update (including the binary file) is needed for the record. This is always set to True when new TK indexing status records are created. It is also set to True if the mapped binary file (e.g. CV/Resume) has changed.
- Is Locked: (Boolean) deprecated since version 5.6.4
- Last indexing request checksum: checksum of the last successful indexation request XML sent to TK; used to check whether mapped data has changed since the last request.
- Last skipped indexing request: timestamp of the last skipped indexing request (i.e. a request skipped because its checksum was the same as that of the last successful indexing request).
- Is Candidate: (Boolean) set to True if this TK indexing status record is/was linked to a Candidate rather than a Vacancy. Set to False if this record is linked to a Vacancy.
- Indexation Request Submission Time: set to the current date-time when the integration selects a record to be indexed or removed.
- Next indexing request checksum: reserved for use in a future release.
- Index request when busy: indicates that a request to update the record in the Textkernel index arrived while the record was actively being updated or removed. When this happens, the current date-time is stored in this field. When the operation in progress completes, the integration checks this field; if a request is pending, it sets the status to To be indexed instead of Up to date or Removed.
- Removal request when busy: indicates that a request to remove the record from the Textkernel index arrived while the record was actively being updated or removed. When this happens, the current date-time is stored in this field. When the operation in progress completes, the integration checks this field; if a request is pending, it sets the status to To be removed instead of Up to date.
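The two "request when busy" fields can be read as a small decision made when an operation finishes. The sketch below illustrates that decision under stated assumptions; it is not the connector's code. The function name and the `"index"`/`"remove"` labels are hypothetical. The documentation also does not say what happens when both fields are set, or when a removal is requested during a removal, so the sketch assumes that the later request wins and that a record that has just been removed stays Removed.

```python
from datetime import datetime
from typing import Optional


def status_after_completion(
    completed_operation: str,
    index_request_when_busy: Optional[datetime],
    removal_request_when_busy: Optional[datetime],
) -> str:
    """Pick the status to store once an in-flight update or removal finishes."""
    if completed_operation not in ("index", "remove"):
        raise ValueError("completed_operation must be 'index' or 'remove'")
    default = "Up to date" if completed_operation == "index" else "Removed"

    pending = []
    if index_request_when_busy is not None:
        pending.append((index_request_when_busy, "To be indexed"))
    if removal_request_when_busy is not None:
        pending.append((removal_request_when_busy, "To be removed"))
    if not pending:
        return default

    # Assumption: if both fields are set, the more recent request wins.
    _, wanted = max(pending, key=lambda item: item[0])
    if wanted == "To be removed" and completed_operation == "remove":
        # Assumption: the record was just removed, so nothing is left to do.
        return "Removed"
    return wanted


earlier = datetime(2024, 1, 1, 10, 0)
later = datetime(2024, 1, 1, 10, 5)
print(status_after_completion("index", None, None))
print(status_after_completion("remove", earlier, None))
print(status_after_completion("index", earlier, later))
```

The three calls cover the default case, an update requested during a removal, and a removal requested after an update while indexing was in progress.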
Excluding records from indexation
The integration can be configured to filter out Candidate or Vacancy records that should not be indexed in Textkernel Search, for instance, expired or inactive vacancies. Filtering is configured in the data mapping admin user interface in the Textkernel managed package.
Automatic error handling and retry mechanism
Temporary or permanent errors can occur when indexing Candidate and Vacancy records in Textkernel Search. When this happens, additional information about the error can be found in the related Textkernel Indexing Status record.
Textkernel classifies the errors based on their severity:
- Severity level 1, temporary error: the error is caused by a temporary outage or degradation of the service. The connector will attempt to index the record again on the next retry. After 10 attempts, the record is put in an error state.
- Severity level 2, recoverable error: this error is caused by a misconfiguration (e.g. wrong credentials) and it requires human intervention to be fixed. Please report this to the administrator of the integration in your organization.
- Severity level 3, permanent error: this error occurs when the Candidate/Vacancy record cannot be indexed in Textkernel Search, for instance because of an unsupported file type of the candidate resume. Retrying will not solve the issue.
When an error of severity level 1 occurs, the connector automatically attempts to index the record again, up to 10 times at 60-minute intervals. The number of attempts and the time of the next retry are visible in the fields Retry Number and Next Retry of the Textkernel Indexing Status.
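The retry rule can be expressed in a few lines. This is an illustrative sketch, not the connector's implementation: the function and constant names are hypothetical, `retry_number` is assumed to count the retries already made, and severity 2 and 3 errors are assumed never to be retried automatically, since only severity 1 is described as retried.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_RETRIES = 10                        # from the text: 10 attempts
RETRY_INTERVAL = timedelta(minutes=60)  # from the text: every 60 minutes


def next_retry(severity: int, retry_number: int, failed_at: datetime) -> Optional[datetime]:
    """Return the time of the next automatic retry, or None if none is due."""
    if severity != 1:
        # Assumption: configuration (2) and permanent (3) errors are not retried.
        return None
    if retry_number >= MAX_RETRIES:
        # Retries exhausted; the record stays in an error state.
        return None
    return failed_at + RETRY_INTERVAL


failed = datetime(2024, 1, 1, 12, 0)
print(next_retry(1, 0, failed))
print(next_retry(1, 10, failed))
print(next_retry(3, 0, failed))
```

With these values, a first temporary failure at 12:00 is retried at 13:00, while an exhausted or permanent error yields no further retry.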
Manually forcing indexation
Info
This feature is only available once the admin has configured it.
If you want to force a Candidate or Vacancy record to be immediately indexed in or deleted from Textkernel Search, the related Textkernel Indexing Status record offers two actions, Update in index and Delete from index, which trigger an immediate update or deletion of the record.
This can be useful if an error occurred and you want to retry immediately to test if it is resolved.
Monitoring indexation
This section requires users to have access to the related object Textkernel Indexing Status.
One of the most common reasons why a Candidate or Vacancy record is not visible or not updated in Textkernel Search is an indexation error. You can check whether an error occurred by looking at the related Textkernel Indexing Status record.
If the field Status is set to Up to date, the Candidate or Vacancy record has been successfully indexed in Textkernel Search and the cause must lie elsewhere.
If the field Status is set to To be indexed, we need to investigate further:
- if the record was recently created/updated/deleted and the field Last indexing request is empty, the record has been queued by the connector but not yet sent to Textkernel Search. The connector will pick it up and index it in one of the next runs. If you want to force the record to be indexed immediately, you can use the action Update in index (if available in your Salesforce configuration);
- if the field Last indexing request has a timestamp earlier than the record creation/update/deletion, the record has been queued for indexing but not picked up yet. The connector will index it in one of the next runs. Fields like Response Code, Response Message, Error Severity Code, Retry Number and Next Retry refer to the previous indexation request (if the Candidate or Vacancy record is not newly created);
- if the field Last indexing request has a timestamp later than the record creation/update/deletion, an error occurred and the fields Response Code, Response Message, Error Severity Code, Retry Number and Next Retry contain information about the error and when the next indexation retry will happen.
If the Status is set to To be removed, the same logic described above for the field Last indexing request applies.
If the Status is set to Removed, the Candidate or Vacancy record has been successfully deleted from Textkernel Search. Even though the record is not indexed anymore in Textkernel Search, the related Textkernel Indexing Status record is kept for historical tracking purposes.
Tip
Administrators can monitor the status of the indexation from the Monitoring page in the Textkernel App.
This helps to investigate why a record was deleted from Textkernel Search when it should not have been.
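The troubleshooting steps in this section amount to a short decision procedure on the Status and Last indexing request fields. The sketch below restates them as a function for clarity; the function name, parameter names and returned messages are hypothetical, and treating equal timestamps as the error case is an assumption, since the text only covers earlier and later timestamps.

```python
from datetime import datetime
from typing import Optional


def diagnose(
    status: str,
    last_indexing_request: Optional[datetime],
    record_changed_at: datetime,
) -> str:
    """Summarise what a Textkernel Indexing Status record says about a record."""
    if status == "Up to date":
        return "indexed successfully; the cause lies elsewhere"
    if status == "Removed":
        return "removed from Textkernel Search; TIS kept for history"
    if status in ("To be indexed", "To be removed"):
        if last_indexing_request is None or last_indexing_request < record_changed_at:
            # Empty or older timestamp: queued, waiting for one of the next runs.
            return "queued; not picked up yet"
        # Newer timestamp: an attempt was made after the change and failed.
        return "error on last attempt; check response and retry fields"
    return "operation in progress or unrecognised status"


changed = datetime(2024, 1, 1, 12, 0)
print(diagnose("To be indexed", None, changed))
print(diagnose("To be indexed", datetime(2024, 1, 1, 13, 0), changed))
```

The first call reports a queued record; the second reports an error because the last request happened after the record changed, yet the status did not move to Up to date.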
Exporting records for support investigation
Some indexation problems are data-dependent. To analyze these scenarios, the Textkernel Support team may ask to see the Salesforce record data that triggered the problem.
You can easily export the record data by visiting the related Textkernel Indexing Status record and clicking on the Get TK Record XML and/or Get TK CV/Vacancy File buttons.
- Get TK Record XML: generates the XML file that contains the record information stored in Salesforce;
- Get TK CV/Vacancy File: downloads the binary file attached to the Salesforce record (the candidate resume or a file containing the job description - if any - that is sent to the Textkernel Search! index).
When requested by the Textkernel Support team, please provide both files.
Note
The Get TK Record XML and Get TK CV/Vacancy File buttons might need to be added to the page layout if they are not visible in the Textkernel Indexing Status record page.