Query Extraction Service

This service is only available when Match is enabled for your environment.

Method extractFromUrl

Method call

extractFromUrl(environment, password, url, extractorName) : Query

Description

The extractFromUrl method takes a URL as input and returns a query. Query extraction creates a query from a semantically parsed document, for example automatically building a query from a vacancy in order to search for suitable candidates. If extractorName is provided, that extractor is used; otherwise the configured default extractor is used.

Parameters

Parameter Name	Type	Description
environment	string	Identifier of a search environment.
password	string	Password for the search environment.
url	string	URL of the document to be downloaded and processed by the Textkernel Parser.
extractorName	string	Optional. The name of the extraction service to use for query parsing. Relevant when multiple (different) query templates are set up for the environment. Please contact Textkernel Support for more information.

Returns

Result Name	Type	Description
query	String	See the format of the Query Language.

Pre-Condition

Parsing is set up for your environment.
The URL must be accessible to Textkernel.

Post-Condition

None.

Error Handling

Error Code	Description
EMPTY_ARGUMENT	One or more mandatory arguments are empty.
INVALID_PASSWORD	The password is incorrect.
ENVIRONMENT_NOT_AVAILABLE	The environment is not available (see log-file for possible errors).
QUERY_EXTRACTION_NOT_AVAILABLE	The query extraction is not available.
URL_NOT_FOUND	The given external URL cannot be downloaded by the Textkernel Parser.
QUERY_EXTRACTION_EXECUTION_ERROR	An error occurred while processing the document.
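
A client may want to turn the codes above into readable messages. The following Python sketch assumes the client has already obtained the error code as a plain string; how the code is delivered (for example inside a SOAP fault) is not specified here, and describe_error is a hypothetical helper name, not part of the service.

```python
# Error codes and descriptions copied from the table above.
EXTRACT_FROM_URL_ERRORS = {
    "EMPTY_ARGUMENT": "One or more mandatory arguments are empty.",
    "INVALID_PASSWORD": "The password is incorrect.",
    "ENVIRONMENT_NOT_AVAILABLE": "The environment is not available.",
    "QUERY_EXTRACTION_NOT_AVAILABLE": "The query extraction is not available.",
    "URL_NOT_FOUND": "The given external URL cannot be downloaded.",
    "QUERY_EXTRACTION_EXECUTION_ERROR": "An error occurred while processing the document.",
}


def describe_error(code: str) -> str:
    """Return a readable message for an error code (hypothetical helper)."""
    description = EXTRACT_FROM_URL_ERRORS.get(code)
    if description is None:
        # Codes not listed in the table are reported as-is.
        return f"{code}: unrecognized error code"
    return f"{code}: {description}"
```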

Method extractFromUrlWithToken

Method call

extractFromUrlWithToken(accessToken, url, extractorName) : Query

Description

This method behaves differently depending on the scheme (doc, http, or https) of the url parameter.

doc:// The URI is expected to be in the format described in Match Queries. The document is retrieved from Search's own docstore and sent to the extractor to generate a query. If the URI contains an extractorName, that extractor is used; otherwise the method's extractorName parameter is used, and if neither is set, the configured default extractor is used.
http://, https:// The url parameter is expected to be a valid external URL, which this service forwards to the configured extractor account with the specified name. If no extractorName is provided, the configured default extractor is used. The generated query is returned.
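
The extractor precedence described above can be sketched in Python as follows. This only illustrates the documented rules and is not the service's implementation. The function name choose_extractor and the parameter uri_extractor_name are hypothetical; because the doc:// URI format is defined in Match Queries and not here, the sketch assumes the caller has already read any extractorName out of the URI.

```python
from typing import Optional
from urllib.parse import urlparse


def choose_extractor(
    url: str,
    extractor_name: Optional[str] = None,
    uri_extractor_name: Optional[str] = None,  # assumed to be pre-parsed from a doc:// URI
    default_extractor: str = "default",        # placeholder for the configured default
) -> str:
    """Pick the extractor following the precedence rules for each scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "doc":
        # URI value first, then the method parameter, then the default.
        return uri_extractor_name or extractor_name or default_extractor
    if scheme in ("http", "https"):
        # External URLs only consider the method parameter.
        return extractor_name or default_extractor
    raise ValueError(f"unsupported scheme: {scheme!r}")
```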

HTTP Servlet

When the service is called as an HTTP POST servlet instead of a SOAP service call, the parameters are:

URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl
Parameters: accessToken, url, extractorName.
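
As a rough illustration, the POST request could be prepared with Python's standard library as shown below. The sketch assumes the servlet accepts a form-encoded body (application/x-www-form-urlencoded), which is not stated here; build_url_extraction_request is a hypothetical helper name. The request is only built, not sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SERVLET_URL = "https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl"


def build_url_extraction_request(
    access_token: str, url: str, extractor_name: Optional[str] = None
) -> Request:
    """Build (but do not send) the POST request for the URL servlet."""
    fields = {"accessToken": access_token, "url": url}
    if extractor_name:
        # extractorName is optional; omit it to use the default extractor.
        fields["extractorName"] = extractor_name
    body = urlencode(fields).encode("utf-8")  # assumption: form-encoded body
    return Request(
        SERVLET_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The resulting object can be passed to urllib.request.urlopen to perform the call. The response format is not described in this section, so the sketch leaves response handling out.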

Method extractFromFile

Method call

extractFromFile(environment, password, filename, fileContent, extractorName) : Query

Description

The extractFromFile method takes a file as input and returns a query. The Textkernel Parser must be enabled for the environment so that it can process the file and return a templated result as a query. If extractorName is provided, that extractor is used. If it is null or does not match any extractor defined in the environment configuration, the default extractor is used.
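
The fallback rule for extractorName can be sketched in Python as follows. The names select_extractor, configured_extractors, and default_extractor are hypothetical and stand in for whatever the environment configuration defines; this is an illustration of the rule, not the service's code.

```python
from typing import Collection, Optional


def select_extractor(
    extractor_name: Optional[str],
    configured_extractors: Collection[str],
    default_extractor: str,
) -> str:
    """Use the named extractor only if it is configured, else the default."""
    if extractor_name is not None and extractor_name in configured_extractors:
        return extractor_name
    # Null or unknown names fall back to the default extractor.
    return default_extractor
```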

Parameters

Parameter Name	Type	Description
environment	string	Identifier of a search environment.
password	string	Password for the search environment.
filename	string	Filename of the document to process.
fileContent	byte array	Binary file content of the document to process.
extractorName	string	Optional. The name of the extraction service to use for query parsing. Relevant when multiple (different) query templates are set up for your environment. Please contact Textkernel Support for more information.

Returns

Result Name	Type	Description
query	String	See the format of the Query Language.

Pre-Condition

Document Parsing is set up for your environment.
The file must be in a format that the Textkernel Parser supports (e.g. docx, PDF, HTML).

Post-Condition

None.

Error Handling

Error Code	Description
EMPTY_ARGUMENT	One or more mandatory arguments are empty.
INVALID_PASSWORD	The password is incorrect.
ENVIRONMENT_NOT_AVAILABLE	The environment is not available (see log-file for possible errors).
QUERY_EXTRACTION_NOT_AVAILABLE	The query extraction is not available.
QUERY_EXTRACTION_EXECUTION_ERROR	An error occurred while processing the document.

Method extractFromFileWithToken

Method call

extractFromFileWithToken(accessToken, filename, fileContent, extractorName) : Query

Description

This method is identical to extractFromFile described above, but it requires a valid accessToken from the authentication service instead of the environment and password parameters.

HTTP Servlet

When the service is called as an HTTP POST servlet instead of a SOAP service call, the parameters are:

URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionFile
Parameters: accessToken, file, extractorName.
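
A file upload over HTTP POST is commonly sent as multipart/form-data. The Python sketch below assumes that this servlet expects that encoding, with the document in the file field; neither is confirmed in this section. The helper name encode_file_extraction_form is hypothetical. It only builds the request body and the matching Content-Type header value.

```python
import uuid
from typing import Optional, Tuple


def encode_file_extraction_form(
    access_token: str,
    filename: str,
    file_content: bytes,
    extractor_name: Optional[str] = None,
    boundary: Optional[str] = None,
) -> Tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data POST."""
    boundary = boundary or uuid.uuid4().hex
    fields = [("accessToken", access_token)]
    if extractor_name:
        fields.append(("extractorName", extractor_name))

    parts = []
    for name, value in fields:
        # Plain text fields: one part per parameter.
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))

    # The document itself goes into the "file" field as raw bytes.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The body and content type can then be posted to the URL above with any HTTP client, for example through urllib.request.Request with the Content-Type header set to the returned value.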