Query Extraction Service
This service is only available when Match is enabled for your environment.
Method extractFromUrl🔗
Method call🔗
extractFromUrl(environment, password, url, extractorName) : Query
Description🔗
The query extraction from URL method receives a URL as input and returns a query. Query extraction creates a query from a semantically parsed document, for example automatically creating a query from a vacancy to search for suitable candidates. If the extractorName parameter is provided, that extractor is used; otherwise the configured default extractor is used.
Parameters🔗
Parameter Name | Type | Description |
---|---|---|
environment | string | identifier of a search environment |
password | string | password for the search environment |
url | string | URL of the document to be downloaded and processed by the Textkernel Parser |
extractorName | string | optional. The name of the extraction service that will be used for query parsing. Used in case there are multiple (different) query templates set up for the environment. Please contact Textkernel Support for more information. |
Returns🔗
Result Name | Type | Description |
---|---|---|
query | String | See format of the Query Language |
Pre-Condition🔗
- Parsing is set up for your environment.
- The URL must be accessible to Textkernel.
Post-Condition🔗
None.
Error Handling🔗
Error Code | Description |
---|---|
EMPTY_ARGUMENT | One or more mandatory arguments are empty. |
INVALID_PASSWORD | The password is incorrect. |
ENVIRONMENT_NOT_AVAILABLE | The environment is not available (see log-file for possible errors). |
QUERY_EXTRACTION_NOT_AVAILABLE | The query extraction is not available. |
URL_NOT_FOUND | The given external URL cannot be downloaded by the Textkernel Parser. |
QUERY_EXTRACTION_EXECUTION_ERROR | An error occurred while processing the document. |
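The EMPTY_ARGUMENT error can be avoided by checking the mandatory arguments before calling the service. The following is a minimal client-side sketch, assuming that environment, password and url are mandatory (extractorName is documented as optional). The function name `missing_arguments` is hypothetical and not part of the service.

```python
from typing import List, Optional


def missing_arguments(environment: str, password: str, url: str,
                      extractor_name: Optional[str] = None) -> List[str]:
    """Return the names of mandatory arguments that are empty or blank.

    extractorName is optional in the parameter table, so it is never reported.
    """
    mandatory = {"environment": environment, "password": password, "url": url}
    return [name for name, value in mandatory.items()
            if value is None or not value.strip()]
```

An empty result means the call would not be rejected for empty arguments; the other error codes can still be returned by the service.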
Method extractFromUrlWithToken🔗
Method call🔗
extractFromUrlWithToken(accessToken, url, extractorName) : Query
Description🔗
This endpoint behaves differently depending on the scheme (http, https, doc) of the url parameter.
doc://
This URI is expected to be in the format described in Match Queries. In this case the endpoint retrieves the document from Search's own docstore and sends it to the extractor to generate a query. If the URI contains an extractorName, that one is used; otherwise the endpoint's extractorName parameter is evaluated, and if neither is set, the configured default extractor is used.
http://, https://
The url parameter is expected to be a valid external URL, which this service forwards to the configured extractor account with the specified name. If no extractorName is provided, the configured default extractor is used. The generated query is returned.
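The scheme-dependent behaviour and the extractor precedence described above can be summarised in a short sketch. It only models the documented rules; the function names, the `"docstore"`/`"external"` labels and the way the extractor names are passed in are illustrative assumptions, and the actual resolution happens inside the service. How an extractorName is embedded in a doc:// URI is defined in Match Queries and is not parsed here.

```python
from typing import Optional
from urllib.parse import urlsplit


def document_source(url: str) -> str:
    """Classify the url parameter by its scheme."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "doc":
        return "docstore"   # document is read from Search's own docstore
    if scheme in ("http", "https"):
        return "external"   # URL is forwarded to the extractor account
    raise ValueError(f"unsupported scheme: {scheme!r}")


def resolve_extractor(source: str,
                      uri_extractor: Optional[str],
                      param_extractor: Optional[str],
                      default_extractor: str) -> str:
    """Apply the documented precedence for choosing an extractor."""
    if source == "docstore" and uri_extractor:
        return uri_extractor            # 1. extractorName inside the doc:// URI
    if param_extractor:
        return param_extractor          # 2. extractorName parameter
    return default_extractor            # 3. configured default
```

For http and https URLs only the last two steps apply, since an external URL carries no extractorName of its own.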
HTTP Servlet🔗
When the service is called as an HTTP POST servlet instead of a SOAP service call, the parameters are:
- URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl
- Parameters: accessToken, url, extractorName.
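A request to this servlet could be prepared as in the sketch below. It assumes the parameters are sent as a form-encoded POST body, which this page does not state explicitly; the function name `build_url_extraction_request` is hypothetical. The request is only built here, not sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SERVLET_URL = "https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl"


def build_url_extraction_request(access_token: str, url: str,
                                 extractor_name: Optional[str] = None) -> Request:
    """Build (but do not send) a POST request for the queryExtractionUrl servlet."""
    params = {"accessToken": access_token, "url": url}
    if extractor_name:
        # Optional: omit it to let the service use the default extractor.
        params["extractorName"] = extractor_name
    body = urlencode(params).encode("utf-8")  # assumption: form-encoded body
    return Request(
        SERVLET_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The resulting object can be passed to `urllib.request.urlopen`. How the query is encoded in the response body is not specified on this page, so response handling is left out.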
Method extractFromFile🔗
Method call🔗
extractFromFile(environment, password, filename, fileContent, extractorName) : Query
Description🔗
The query extraction from file method receives a file as input and returns a query. The Textkernel Parser needs to be enabled for the environment to process the file and return a templated result as a query. If extractorName is provided, that extractor is used. If it is null or does not match any extractor defined in the environment configuration, the default extractor is used.
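The fallback to the default extractor can be written as a small function. This is a sketch of the described behaviour only: `select_extractor`, the `configured` set and the extractor names used with it are hypothetical, and the real selection is done by the service based on the environment configuration.

```python
from typing import Optional, Set


def select_extractor(extractor_name: Optional[str],
                     configured: Set[str],
                     default_extractor: str) -> str:
    """Use the requested extractor if it is configured, else the default."""
    if extractor_name is not None and extractor_name in configured:
        return extractor_name
    # Covers both a null name and a name that matches no configured extractor.
    return default_extractor
```

Note that, under this rule, a misspelled extractorName does not produce an error; the default extractor is used instead.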
Parameters🔗
Parameter Name | Type | Description |
---|---|---|
environment | string | identifier of a search environment |
password | string | password for the search environment |
filename | string | Filename of the document to process. |
fileContent | byte array | Binary file content of the document to process. |
extractorName | string | optional. The name of the extraction service that will be used for query parsing. Used in case there are multiple (different) query templates set up for your environment. Please contact Textkernel Support for more information. |
Returns🔗
Result Name | Type | Description |
---|---|---|
query | String | See format of the Query Language |
Pre-Condition🔗
- Document Parsing is set up for your environment.
- The file must be in a format that is supported by the Textkernel Parser (e.g. docx, PDF, HTML).
Post-Condition🔗
None.
Error Handling🔗
Error Code | Description |
---|---|
EMPTY_ARGUMENT | One or more mandatory arguments are empty. |
INVALID_PASSWORD | The password is incorrect. |
ENVIRONMENT_NOT_AVAILABLE | The environment is not available (see log-file for possible errors). |
QUERY_EXTRACTION_NOT_AVAILABLE | The query extraction is not available. |
QUERY_EXTRACTION_EXECUTION_ERROR | An error occurred while processing the document. |
Method extractFromFileWithToken🔗
Method call🔗
extractFromFileWithToken(accessToken, filename, fileContent, extractorName) : Query
Description🔗
The method is identical to extractFromFile described above, but requires a valid accessToken from the authentication service instead of the environment and password parameters.
HTTP Servlet🔗
When the service is called as an HTTP POST servlet instead of a SOAP service call, the parameters are:
- URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionFile
- Parameters: accessToken, file, extractorName.
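A file upload to this servlet could be prepared as in the sketch below. It assumes the servlet accepts a multipart/form-data body with the file in the `file` field, which this page does not state explicitly; the function name `build_file_extraction_request` and the `boundary` argument are hypothetical. The request is only built here, not sent.

```python
import uuid
from typing import Optional
from urllib.request import Request

SERVLET_URL = "https://home.textkernel.nl/match-SearchBox3/queryExtractionFile"


def build_file_extraction_request(access_token: str, filename: str,
                                  file_content: bytes,
                                  extractor_name: Optional[str] = None,
                                  boundary: Optional[str] = None) -> Request:
    """Build (but do not send) a multipart POST for the queryExtractionFile servlet."""
    boundary = boundary or uuid.uuid4().hex
    fields = {"accessToken": access_token}
    if extractor_name:
        fields["extractorName"] = extractor_name

    parts = []
    for name, value in fields.items():
        # Plain text form fields.
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    # The document itself, sent as raw bytes in the "file" field.
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8"))
    parts.append(file_content)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return Request(
        SERVLET_URL,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

The file content would typically come from `open(path, "rb").read()`, and the request can be sent with `urllib.request.urlopen`.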